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Abstract 


Image  segmentation  is  a  fundamental  task  in  image  analysis  responsible  for  partitioning  an 
image  into  multiple  sub-regions  based  on  a  desired  feature.  Active  contours  have  been  widely 
used  as  attractive  image  segmentation  methods  because  they  always  produce  sub-regions  with 
continuous  boundaries,  while  the  kernel-based  edge  detection  methods,  e.g.  Sobel  edge  detectors, 
often  produce  discontinuous  boundaries.  The  use  of  level  set  theory  has  provided  more  flexibil¬ 
ity  and  convenience  in  the  implementation  of  active  contours.  However,  traditional  edge-based 
active  contour  models  have  been  applicable  to  only  relatively  simple  images  whose  sub-regions 
are  uniform  without  internal  edges. 

A  partial  solution  to  the  problem  of  internal  edges  is  to  partition  an  image  based  on  the 
statistical  information  of  image  intensity  measured  within  sub-regions  instead  of  looking  for 
edges.  Although  representing  an  image  as  a  piecewise-constant  or  unimodal  probability  density 
functions  produces  better  results  than  traditional  edge-based  methods,  the  performances  of 
such  methods  is  still  poor  on  images  with  sub-regions  consisting  of  multiple  components,  e.g.  a 
military  vehicle  covered  by  a  camouflage  pattern.  The  segmentation  of  this  kind  of  multispec- 
tral  images  is  even  a  more  difficult  problem.  The  object  of  this  work  is  to  develop  advanced 
segmentation  methods  which  provide  robust  performance  on  the  images  with  non-uniform  sub- 
regions. 

In  this  work,  we  propose  a  framework  for  image  segmentation  which  partitions  an  im¬ 
age  based  on  the  statistics  of  image  intensity  where  the  statistical  information  is  represented 
as  a  mixture  of  probability  density  functions  defined  in  a  multi-dimensional  image  intensity 
space.  Depending  on  the  method  to  estimate  the  mixture  density  functions,  three  active  con¬ 
tour  models  are  proposed:  unsupervised  multi-dimensional  histogram  method,  half-supervised 
multivariate  Gaussian  mixture  density  method,  and  supervised  multivariate  Gaussian  mixture 
density  method.  The  implementation  of  active  contours  is  done  using  level  sets. 

The  proposed  active  contour  models  show  robust  segmentation  capabilities  on  images  where 
traditional  segmentation  methods  show  poor  performance.  Also,  the  proposed  methods  provide 
a  means  of  autonomous  pattern  classification  by  integrating  image  segmentation  and  statistical 
pattern  classification. 
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Chapter  1 

Introduction 


In  most  image  analysis  operations,  pattern  classifiers  require  individual  objects  to  be  separated 
from  the  image,  so  the  description  of  those  objects  can  be  transformed  into  a  suitable  form 
for  computer  processing.  Image  segmentation  is  a  fundamental  task,  responsible  for  the 
separating  operation.  The  function  of  segmentation  is  to  partition  an  image  into  its  constituent 
and  disjoint  sub- regions1,  which  are  uniform  according  to  their  properties,  e.g.  intensity,  color, 
and  texture.  Segmentation  algorithms  are  generally  based  on  either  discontinuity  among  sub- 
regions,  i.e.  edges,  or  uniformity2  within  a  sub-region,  though  there  are  some  segmentation 
algorithms  relying  on  both  discontinuity  and  uniformity. 


1.1  Image  Segmentation  and  Pattern  Classification 

The  distinction  between  image  segmentation  and  pattern  classification  is  often  not  clear.  The 
function  of  segmentation  is  simply  to  partition  an  image  into  multiple  sub-regions,  while  the 
function  of  pattern  classification  is  to  identify  the  partitioned  sub-regions.  Thus,  segmentation 
and  pattern  classification  usually  function  as  separate  and  sequential  processes.  However,  they 
might  function  as  an  integrated  process  depending  on  the  image  analysis  problem  and  the  perfor¬ 
mance  of  the  segmentation  method.  In  either  way,  segmentation  critically  affects  the  results  of 
pattern  classification,  and  often  determines  the  eventual  success  or  failure  of  the  image  analysis. 

Since  segmentation  is  an  important  task  in  image  analysis,  it  is  involved  in  most  image 
analysis  applications,  particularly  those  related  to  pattern  classification,  e.g.  medical  imaging, 
remote  sensing,  security  surveillance,  military  target  detection.  The  level  to  which  segmenta¬ 
tion  is  carried  depends  on  the  problem  being  solved.  That  is,  segmentation  should  stop  when 
the  region  of  interest  (ROI)  in  the  application  have  been  isolated.  Due  to  this  property  of  prob¬ 
lem  dependence,  autonomous  segmentation  is  one  of  the  most  difficult  tasks  in  image  analysis. 
Noise  and  mixed  pixels  caused  by  the  poor  resolution  of  sensor  images  make  the  segmentation 
problem  even  more  difficult.  In  this  document,  we  propose  novel  segmentation  methods  using 
a  variational  framework,  called  active  contours. 


1  Partitions,  sub-regions ,  parts ,  sections ,  objects ,  and  segments  are  often  interchangeably  used.  The  term 
sub-regions  will  be  consistently  used  in  this  document. 

2 The  terms  uniformity  and  homogeneity  are  often  interchangeably  used.  The  term  uniformity  will  be  consis¬ 
tently  used  in  this  document. 
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1.2  Active  Contours 


Active  contours  are  connectivity-preserving  relaxation  [1]  methods,  applicable  to  the  image 
segmentation  problems.  Active  contours  have  been  used  for  image  segmentation  and  boundary 
tracking  since  the  first  introduction  of  snakes  by  Kass  et  al.  [2].  The  basic  idea  is  to  start 
with  initial  boundary  shapes  represented  in  a  form  of  closed  curves,  i.e.  contours,  and  itera¬ 
tively  modify  them  by  applying  shrink/expansion  operations  according  to  the  constraints  of 
the  image.  Those  shrink/expansion  operations,  called  contour  evolution ,  are  performed  by  the 
minimization  of  an  energy  function  like  traditional  region-based  segmentation  methods  or  by 
the  simulation  of  a  geometric  partial  differential  equation  (PDE)  [3] . 

An  advantage  of  active  contours  as  image  segmentation  methods  is  that  they  partition  an 
image  into  sub-regions  with  continuous  boundaries,  while  the  edge  detectors  based  on  threshold 
or  local  filtering,  e.g.  Canny  [4]  or  Sobel  operator,  often  result  in  discontinuous  boundaries.  The 
use  of  level  set  theory  has  provided  more  flexibility  and  convenience  in  the  implementation  of 
active  contours.  Depending  on  the  implementation  scheme,  active  contours  can  use  various 
properties  used  for  other  segmentation  methods  such  as  edges,  statistics,  and  texture.  In  this 
document,  the  proposed  active  contour  models  using  the  statistical  information  of  image  inten¬ 
sity  within  a  sub-region. 


1.3  Terminology 

A  multispectral  image3  is  defined  as  a  function  on  a  two-dimensional  spatial  domain  C,  given 

by 

I  (1.1) 

where  the  input  of  the  function  is  a  two-dimensional  vector  denoting  the  coordinates  (x,  y),  and 
the  output  is  a  vector- valued  image  intensity  I  E  ?HB .  B  denotes  the  dimension  of  I,  and  is 
equivalent  to  the  number  of  spectral  bands.  Figure  1.1  shows  an  example  of  a  £>-band  image. 
In  this  document,  we  define  a  multispectral  image  as  a  general  form  of  images  and  a  scalar 


Figure  1.1:  A  multispectral  image  I (x,y):  D  — ►  5ft B 

3Four  terms,  multispectral  images ,  multi-channel  images ,  vector-valued  images  [5],  and  multi-valued  images  [6], 
are  often  interchangeably  used.  Only  the  term  multispectral  images  will  be  used  in  this  document  to  avoid  a 
confusion  between  vector-valued  images  and  vector  (format)  images  [7].  Every  image  function  introduced  in  this 
document  forms  a  bitmap  image. 
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image  as  a  particular  case  of  multispectral  images  when  B  —  1.  The  most  common  example 
of  multispectral  images  is  an  RGB  image,  consisting  of  three  spectral  bands:  red,  green,  and 
blue.  Hyperspectral  images,  used  in  remote  sensing,  are  other  examples  of  multispectral  im¬ 
ages.  A  set  of  images,  measured  by  physically  different  sensors  and  registered,  are  also  examples 
of  multispectral  images.  As  sensor  fusion  approaches  become  more  popular  in  industrial  and 
medical  imaging,  there  will  be  more  chances  to  encounter  multispectral  images  in  image  analysis. 

In  image  processing,  particularly  medical  image  processing,  modality  refers  to  the  type  of 
input  [8],  such  as  the  type  of  sensors,  e.g.  MRI,  CT,  or  the  bandwidth  of  spectrum.  Thus, 
a  multimodal  image  often  refers  to  a  multispectral  image,  where  each  channel  is  measured  by 
different  modalities.  In  statistical  pattern  classification,  a  statistical  distribution  consisting  of, 
or  representable  by,  multiple  sub-classes4  is  called  multimodal  distribution ,  while  the  other  case 
is  called  unimodal  distribution.  Figure  1.2  shows  examples  of  the  two  cases.  The  probability 


Figure  1.2:  An  example  of  unimodal  (solid)  and  multimodal  (dotted)  distributions 

density  function  (PDF)  p(/),  presented  as  the  solid  curve,  shows  a  unimodal  distribution,  while 
the  mixture  density  function  p(I )  =  ap\(I)  +  (1  —  a)p2(I),  presented  as  the  dotted  curve,  shows 
a  multimodal  distribution.  A  multi-dimensional  statistical  distribution,  e.g.  a  two-dimensional 
Gaussian  distribution,  is  called  a  multivariate  distribution  instead  of  a  multimodal  distribution. 
Note  that  the  terminology  of  modality  is  different  in  image  processing  and  statistical  pattern 
classification.  In  this  document,  we  use  the  terminology  used  in  statistical  pattern  classifica¬ 
tion,  so  unimodal  or  multimodal  refers  to  the  statistical  property  of  image  intensity  whether 
composed  of  a  single  class  or  multiple  sub-classes,  and  univariate  or  multivariate  refers  to  the 
dimensionality  of  image  intensity  such  as  a  scalar  image  or  a  multispectral  image. 


1.4  Motivations 

Although  most  image  segmentation  methods  as  well  as  active  contours  assume  that  each  sub- 
region  in  the  image  has  a  uniform  property,  we  often  encounter  images  with  non-uniform  sub- 
regions.  Figure  1.3  shows  examples  of  an  image  with  uniform  and  non-uniform  sub-regions. 
The  zebra  shown  in  figure  1.3(b)  consists  of  white  and  black  strips,  while  the  triangle  shown 
in  figure  1.3(a)  has  a  uniform  image  intensity.  The  statistical  distribution  of  image  intensity 

4 Three  terms  sub- classes,  components ,  and  modes  are  often  interchangeably  used.  The  term  sub- classes  will 
be  consistently  used  in  this  document. 
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(a)  (b) 

Figure  1.3:  Examples  of  gray  images  with  uniform  and  non-uniform  sub-regions:  (a)  a  triangle 
with  uniform  intensity,  (b)  a  zebra  with  white  and  black  strips 


within  the  triangle  and  zebra  would  be  similar  to  the  graph  shown  in  figure  1.2.  Since  the 
statistical  distribution  of  image  intensity  within  the  zebra  has  at  least  two  modes,  i.e.  one  for 
black  stripes  and  the  other  for  white  stripes,  the  segmentation  method  should  be  able  to  recog¬ 
nize  a  mixture  of  sub-classes  as  the  class  representing  the  zebra.  Otherwise,  the  segmentation 
method  would  produce  an  over-segmentation  result  separating  the  white  and  black  stripes  or 
an  under-segmentation  result  not  separating  the  white  stripes  and  the  background.  For  this 
kind  of  problems,  we  propose  advanced  image  segmentation  methods  using  the  statistical  in¬ 
formation  of  image  intensity,  where  the  statistical  distribution  of  image  intensity  is  represented 
as  a  mixture  density  function. 

The  image  segmentation  problem  of  images  with  non-uniform  sub-regions  becomes  even 
more  difficult  when  the  image  has  multiple  bands.  Figure  1.4  shows  examples  of  an  RGB  image 
with  uniform  and  non-uniform  sub-regions.  The  toy  car  shown  in  figure  1.4(a)  is  painted  with 


(a)  (b) 


Figure  1.4:  Examples  of  a  multispectral  image  with  uniform  and  non-uniform  sub-regions:  (a) 
a  toy  car  painted  gray  (RGB),  (b)  a  toy  tank  covered  by  a  camouflage  pattern  (RGB) 

uniform  gray  color,  and  the  background  is  also  uniform.  The  toy  tank  shown  in  figure  1.4(b)  is 
painted  with  multiple  different  colors  due  to  the  camouflage  pattern.  The  statistical  distribu¬ 
tion  of  vector-valued  image  intensity  within  the  toy  car  has  a  single  mode,  while  the  statistical 
distribution  of  vector- valued  image  intensity  within  the  toy  tank  has  multiple  modes  in  a  multi- 
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dimensional  image  intensity  space.  For  this  kinds  of  problems,  we  need  to  estimate  the  statistics 
of  vector-valued  image  intensity  as  a  mixture  of  multivariate  density  functions.  As  the 
estimation  of  a  multivariate  mixture  density  function  is  a  tough  problem  and  requires  high 
computation,  image  segmentation  using  the  multivariate  mixture  density  functions  is  even  a 
more  difficult  problem.  We  propose  smart  ways  to  deal  with  this  problem. 

Image  segmentation  has  often  been  considered  as  a  preprocessing  of  pattern  classification, 
but  they  are  not  necessarily  separate  procedures  in  the  case  of  statistical  pattern  classification 
and  region-based  segmentation.  A  few  active  contour  models  [9,  5,  10]  have  integrated  those 
two  procedures  as  an  unsupervised  segmentation,  which  partitions  an  image  based  on  the  statis¬ 
tics  of  image  intensity  within  each  subset.  We  propose  two  methods  which  integrate  image 
segmentation  and  statistical  pattern  classification  as  a  supervised  segmentation,  which 
partitions  an  image  based  on  the  image  intensity  at  each  pixel  and  the  statistical  information 
of  training  samples.  This  integration  reduces  the  processing  time  and  helps  to  build  an  au¬ 
tonomous  pattern  recognition  system. 

The  proposed  active  contour  models  are  aimed  at  providing  robust  segmentation  results  for 
complicated  image  analysis,  i.e.  multispectral  images  with  non-uniform  sub-regions,  but  they 
are  also  applicable  to  any  image  segmentation  problem.  Possible  applications  are  multi-sensor 
radiology  in  medical  imaging,  hyperspectral  image  segmentation  in  remote  sensing,  and  color 
image  segmentation. 
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Chapter  2 

Image  Segmentation:  background 


There  are  two  main  approaches  in  image  segmentation:  edge-  and  region-  based.  Edge-based 
segmentation  partitions  an  image  based  on  discontinuities  among  sub-regions,  while  region- 
based  segmentation  does  the  same  function  based  on  the  uniformity  of  a  desired  property 
within  a  sub-region.  In  this  chapter,  we  briefly  discuss  existing  image  segmentation  technolo¬ 
gies  as  background. 


2.1  Edge-based  Segmentation 

Edge-based  segmentation  looks  for  discontinuities  in  the  intensity  of  an  image.  It  is  more  likely 
edge  detection  or  boundary  detection  rather  than  the  literal  meaning  of  image  segmentation, 
introduced  in  chapter  1.  An  edge  can  be  defined  as  the  boundary  between  two  regions  with  rel¬ 
atively  distinct  properties.  The  assumption  of  edge-based  segmentation  is  that  every  sub-region 
in  an  image  is  sufficiently  uniform  so  that  the  transition  between  two  sub-regions  can  be  deter¬ 
mined  on  the  basis  of  discontinuities  alone.  When  this  assumption  is  not  valid,  region-based 
segmentation,  discussed  in  the  next  section,  usually  provides  more  reasonable  segmentation 
results. 

Basically,  the  idea  underlying  most  edge-detection  techniques  is  the  computation  of  a  local 
derivative  operator.  The  gradient  vector  of  an  image  I{x,y),  given  by 

:  fi  -*•  3?2  ,  (2.1) 

is  obtained  by  the  partial  derivatives  dl/dx  and  dl/dy  at  every  pixel  location.  The  local 
derivative  operation  can  be  done  by  convolving  an  image  with  kernels  shown  in  figure  2.1.  The 


(a) 

Figure  2.1:  Examples  of  gradient  kernels  along:  (a)  vertical  direction,  (b)  horizontal  direction 

magnitude  of  the  first  derivative 

|VI|  =  y/(dl/dx)2  +  ( dl/dyT  :  ft  -►  3? 


-1  0  1 

(b) 


-1 

y 

T 


vi  = 


dl /dx 
dl/dy 


(2.2) 


determines  the  presence  of  edges  in  an  image1. 


The  Laplacian  of  an  image  function  /(x,  y)  is  the  sum  of  the  second-order  derivatives,  defined 


as 


V2/  = 


d2I  d2I 
dx 2  dy 2 


(2.3) 


The  general  use  of  the  Laplacian  is  in  finding  the  location  of  edges  using  its  zero-crossings  [11]. 
A  critical  disadvantage  of  the  gradient  operation  is  that  the  derivative  enhances  noise.  As 
a  second-order  derivative,  the  Laplacian  is  even  more  sensitive  to  noise.  An  alternative  is 
convolving  an  image  with  the  Laplacian  of  a  Gaussian  (LoG)  function  [12],  given  by 


LoG(x,y ) 


1 

7TCT4 


x2  +  y2 
2a2 


exp(— 


x2  +  y2 
2a2 


(2.4) 


where  a  two-dimensional  Gaussian  function  with  the  standard  deviation  a  is  defined  as 

G(x,y)  =  ^~2exP(~~  2^/  )  :  n  ^  ft-  (2-5) 

The  LoG  function  produces  smooth  edges  as  the  Gaussian  filtering  provides  smoothing  ef¬ 
fect  [13]. 


Sobel  operation  is  performed  by  convolving  an  image  with  kernels  shown  in  figure  2.2.  Sobel 
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Figure  2.2:  Sobel  operators  along:  (a)  vertical  direction,  (b)  horizontal  direction 

operators  have  the  advantage  of  providing  both  a  derivative  and  a  smoothing  effect  [11,  14]. 
The  smoothing  effect  is  a  particularly  attractive  feature  of  the  Sobel  operators  compared  to  the 
gradient  kernels  shown  in  figure  2.1  because  the  derivative  enhances  noise. 


Canny  edge  detector  [15,  4]  is  based  on  the  extrema  of  the  first  derivative  of  the  Gaus¬ 
sian  operator  applied  to  an  image.  The  operator  first  smoothes  the  image  to  eliminate  noise, 
and  then  finds  high  gradient  regions.  After  non-maximum  suppression,  the  edges  are  finally 
determined  by  two  thresholds,  i.e.  rmin  and  Tmax  as  shown  in  table  2.1.  Canny  edge  detector 
is  known  as  an  optimal  edge  detector  because  it  satisfies  the  criteria  of  low  error  rate,  good 
localization  of  edge  points,  and  a  single  response  to  a  single  edge  pixel  [16]. 


Edge  detection  by  gradient  operations  generally  work  well  only  in  the  images  with  sharp 
intensity  transitions  and  relatively  low  noise.  Due  to  its  sensitivity  to  noise,  some  smoothing  op¬ 
eration  is  generally  required  as  preprocessing,  and  the  smoothing  effect  consequently  blurs  the 
edge  information.  However,  the  computational  cost  is  relatively  lower  than  other  segmentation 
methods  because  the  computation  can  be  done  by  a  local  filtering  operation,  i.e.  convolution 

1  Although  the  literal  meaning  of  the  term  gradient  is  the  gradient  vector  V/,  it  often  refers  to  the  magnitude 
of  gradient  |V/|  in  many  publications. 
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Table  2.1:  Path  searching  in  Canny  edge  detector 

•  If  \X7I(x,y)\  >  Tmax ,  then  I(x,y)  is  an  edge  pixel. 

•  If  Tmin  <C  \S7I(x,y)\  <  Tmax, 

—  If  there  is  a  path  from  (x,y)  to  neighbor  (N)  and  |V/(N)|  >  rmin , 
then  /(x,  y )  is  an  edge  pixel. 

—  Otherwise,  I(x,y)  is  a  non-edge  pixel. 

•  If  \VI(x,y)\  <  Tmin 5  then  I(x,y )  is  a  non-edge  pixel. 


of  an  image  with  a  kernel.  Edge-based  active  contour  models,  discussed  in  section  3.3,  use  the 
magnitude  of  gradient  |V/|  to  determine  the  position  of  edges. 


2.2  Region-based  Segmentation 

Region-based  segmentation  looks  for  uniformity  within  a  sub-region,  based  on  a  desired  prop¬ 
erty,  e.g.  intensity,  color,  and  texture.  Clustering  techniques  encountered  in  pattern  classifica¬ 
tion  literature  have  similar  objectives  and  can  be  applied  for  image  segmentation  [17]. 

Region  growing  [18]  is  a  technique  that  merges  pixels  or  small  sub-regions  into  a  larger  sub¬ 
region.  The  simplest  implementation  of  this  approach  is  pixel  aggregation  [11],  which  starts  with 
a  set  of  seed  points  and  grows  regions  from  these  seeds  by  appending  neighboring  pixels  if  they 
satisfy  the  given  criteria.  Figure  2.3  shows  a  simple  example  of  pixel  aggregation.  Segmentation 
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Figure  2.3:  Pixel  aggregation:  (a)  original  image  with  seeds  underlined;  (b)  segmentation  result 
with  t  —  4 

starts  with  two  initial  seeds,  and  then  the  regions  grow  if  they  satisfy  a  criterion  such  as 

|  I(x,  y )  —  I  (seed)  |  <  r  .  (2.6) 

Despite  the  simple  nature  of  the  algorithm,  there  are  fundamental  problems  in  region  growing: 
the  selection  of  initial  seeds  and  suitable  properties  to  grow  the  regions.  Selecting  initial  seeds 
can  be  often  based  on  the  nature  of  applications  or  images.  For  example,  the  ROI  is  generally 
brighter  than  the  background  in  IR  images.  In  this  case,  choosing  bright  pixels  as  initial  seeds 
would  be  a  proper  choice. 


Additional  criteria  that  utilize  properties  to  grow  the  regions  lead  region  growing  into  more 
sophisticated  methods,  e.g.  region  competition.  Region  competition  [19,  20]  merges  adjacent 
sub-regions  under  criteria  involving  the  uniformity  of  regions  or  sharpness  of  boundaries.  Strong 
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criteria  tend  to  produce  over-segmented  results,  while  weak  criteria  tend  to  produce  poor  seg¬ 
mentation  results  by  over-merging  the  sub-regions  with  blurry  boundaries.  An  alternative  of 
region  growing  is  split- and-merge  [21],  which  partitions  an  image  initially  into  a  set  of  arbitrary, 
disjointed  sub-regions,  and  then  merge  and/or  split  the  sub-regions  in  an  attempt  to  satisfy  the 
segmentation  criteria. 

Another  common  approach  in  region-based  segmentation  is  characterizing  statistical  uni¬ 
formity  of  sub-regions  using  parametric  models,  so  called  statistical  estimation.  With  this 
approach,  two  sub-regions  are  considered  to  be  uniform,  and  consequently  merged,  if  they  can 
be  represented  by  a  single  instance  of  the  model,  i.e.  if  they  have  common  parameter  values 
within  a  threshold.  In  practice,  the  parameters  of  a  sub-region  cannot  be  observed  directly  but 
can  only  be  inferred  from  the  observed  data  and  the  knowledge  of  the  imaging  process.  In  sta¬ 
tistical  approaches,  this  inference  is  often  made  using  Bayes’s  rule  [22]  and  the  conditional  PDF 
p(l(x,  y)\0m),  which  presents  the  conditional  probability  that  certain  data  I (x,y)  (or  statistics 
derived  from  the  data)  will  be  observed,  given  that  sub-region  m  has  the  parameter  values  of 
0m.  In  typical  statistical  region  merging  algorithms  [23],  stochastic  estimates  in  the  parameter 
space  are  obtained  for  different  sub-regions,  and  merging  decisions  are  based  on  the  similarity 
of  these  parameters. 

A  limitation  of  most  estimation-based  segmentation  methods  is  that  they  do  not  explicitly 
represent  the  uncertainty  in  the  estimated  parameter  values  and,  therefore,  are  prone  to  error 
when  parameter  estimates  are  poor.  A  Bayesian  probability  of  homogeneity  directly  exploits  all 
of  the  information  contained  in  the  statistical  image  models,  instead  of  estimating  parameter 
values  [24] .  The  probability  of  homogeneity  is  based  on  the  ability  to  formulate  a  prior  proba¬ 
bility  density  on  the  parameter  space,  and  measures  homogeneity  by  taking  the  expectation  of 
the  data  likelihood  over  a  posterior  parameter  space. 

Image  segmentation  is  often  approached  by  edge-preserving  smoothing  operations  as  well 
as  the  partitioning  operation.  Edge-preserving  smoothing  techniques  can  be  classified  roughly 
two  approaches  [25]:  Markov  random  field  (MRF)  including  energy-based  methods  [26,  27] 
and  diffusion-based  methods  [28,  29].  Both  approaches  show  similar  restoration  characteris¬ 
tics  because  the  diffusion-based  methods  can  be  viewed  as  an  energy-based  method  that  uses 
only  the  prior  energy  term  at  a  given  temperature  [30].  Snyder  et  al.  [31,  32,  33]  proposed 
an  edge-preserving  smoothing  method  for  image  segmentation  based  on  the  technology  called 
mean  field  annealing  (MFA)  [30,  34,  35,  36,  37,  38],  and  the  same  segmentation  method  was 
extended  to  vector- valued  images  by  Han  et  al.  [25,  39].  MFA  is  an  energy-based  method  for 
finding  the  minimum  of  complex  functions  which  typically  have  many  minima  [40].  For  the 
image  segmentation  problem,  a  proper  energy  function  is  defined  intending  to  keep  the  edges 
and  to  smooth  the  rest  of  areas  in  the  image.  The  segmentation  is  performed  by  minimizing 
the  energy  function  using  MFA.  MFA  approximates  a  stochastic  algorithm  called  simulated 
annealing  (SA)  [41],  which  has  shown  to  converge  to  the  global  minimum,  even  for  non-convex 
problems  [42].  Hiriyannaiah  et  al.  [43]  derived  MFA  using  the  analogy  to  physics  for  the  restora¬ 
tion  of  piecewise-constant  images,  and  Bilbro  et  al.  [42]  did  the  same  job  applying  the  MFA  to 
the  images  with  varying  gray  values. 

Region-based  approaches  are  generally  less  sensitive  to  noise,  and  usually  produce  more 
reasonable  segmentation  results  as  they  rely  on  global  properties  rather  than  local  properties, 
but  their  implementation  complexity  and  computational  cost  can  be  often  quite  large.  Statis- 
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tical  segmentation  methods,  both  estimation-based  and  Bayesian-based,  have  been  extended 
to  many  active  contour  models  including  the  proposed  models.  Those  active  contour  models 
based  on  statistical  segmentation  will  be  discussed  in  section  3.4. 


2.3  Other  Segmentation  Methods 

The  watershed  algorithm  [46,  47]  is  a  morphology-based  segmentation  method  [48,  49,  50].  It 
is  based  on  the  assumption  that  any  gray-tone  image  can  be  considered  as  a  topographic  sur¬ 
face  [51].  If  we  flood  this  surface  from  its  minima  preventing  the  merge  of  the  waters  coming 
from  different  sources,  the  surface  is  eventually  separated  as  two  different  sets:  the  catchment 
basins  and  the  watershed  lines.  If  we  apply  this  transformation  to  the  magnitude  of  image 
gradient  |V/|,  the  catchment  basins  correspond  to  the  uniform  sub-regions  in  the  image  and 
the  watershed  lines  correspond  to  the  edges.  The  flooding  operation  is  simulated  using  mor¬ 
phological  distance  operators  [52,  53,  54]. 

Fusions  of  different  principles  have  produced  good  results.  There  have  been  a  few  approaches 
to  integrate  region-  and  edge-based  segmentation  [55,  56],  and  also  an  approach  to  integrate 
region-  and  morphology-based  segmentation  called  watersnakes  [57]. 

Texture  is  another  feature  that  we  can  use  to  determine  the  segmentation  criteria.  Images 
can  be  considered  as  either  a  collection  of  pixels  in  the  spatial  domain  or  the  sum  of  sinusoids  of 
infinite  extent  in  the  spatial-frequency  domain.  Gabor  observed  that  the  spatial  representation 
and  the  spatial-frequency  representation  are  just  opposite  extremes  of  a  continuum  of  possible 
joint  space/spatial- frequency  representations  [58].  In  a  joint  space/spatial- frequency  represen¬ 
tations  for  images,  frequency  is  considered  as  a  local  phenomenon  that  can  vary  with  position 
throughout  the  image.  The  human  visual  system  is  performing  a  form  of  local  spatial-frequency 
analysis  on  the  retinal  image,  and  the  analysis  is  done  by  a  bank  of  bandpass  filters  [59]. 

The  same  approach  can  be  used  to  partition  textured  images  in  image  analysis.  Perceptually 
significant  texture  differences  presumably  correspond  to  differences  in  the  local  spatial-frequency 
content  using  the  space/spatial- frequency  paradigm.  Texture  segmentation  is  done  by  two 
steps:  decomposing  an  image  into  a  joint  space/spatial- frequency  representation  with  a  bank 
of  bandpass  filters  and  using  this  information  to  locate  the  regions  of  similar  local  spatial- 
frequency  content.  The  response  of  the  filter  bank  generates  a  kind  of  multispectral  images, 
where  each  band  represents  the  response  of  the  textured  image  at  a  particular  spatial-frequency 
bandwidth.  The  multi-channel  filtering  has  been  implemented  by  the  convolution  of  the  image 
with  a  stack  of  two-dimensional  Gabor  filters  [60,  61,  62,  63,  64]  or  wavelets  [65,  66]. 
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Chapter  3 

Active  Contours:  background 


The  technique  of  active  contours  has  become  quite  popular  for  a  variety  of  applications,  partic¬ 
ularly  image  segmentation  and  motion  tracking,  during  the  last  decade.  This  methodology  is 
based  upon  the  utilization  of  deformable  contours  which  conform  to  various  object  shapes  and 
motions.  This  chapter  provides  a  theoretical  background  of  active  contours  and  an  overview  of 
existing  active  contour  methods. 

There  are  two  main  approaches  in  active  contours  based  on  the  mathematic  implementation: 
snakes  and  level  sets .  Snakes  explicitly  move  predefined  snake  points  based  on  an  energy  min¬ 
imization  scheme,  while  level  set  approaches  move  contours  implicitly  as  a  particular  level  of  a 
function.  More  details  about  these  two  approaches  will  be  discussed  respectively  in  section  3.1 
and  3.2.  As  image  segmentation  methods,  there  are  two  kinds  of  active  contour  models  accord¬ 
ing  to  the  force  evolving  the  contours:  edge-  and  region-based.  Edge-based  active  contours  use 
an  edge  detector,  usually  based  on  the  image  gradient,  to  find  the  boundaries  of  sub-regions  and 
to  attract  the  contours  to  the  detected  boundaries.  Edge-based  approaches  are  closely  related 
to  the  edge-based  segmentation  discussed  in  section  2.1.  Region-based  active  contours  use  the 
statistical  information  of  image  intensity  within  each  subset  instead  of  searching  geometrical 
boundaries.  Region-based  approaches  are  also  closely  related  to  the  region-based  segmentation 
discussed  in  section  2.2.  More  details  of  these  two  active  contour  models  are  respectively  dis¬ 
cussed  in  section  3.3  and  section  3.4. 


3.1  Snakes 

The  first  model  of  active  contour  was  proposed  by  Kass  et  al.  [2]  and  named  snakes  due  to  the 
appearance  of  contour1  evolution.  Let  us  define  a  contour  parameterized  by  arc  length  s  as 

C(s)  =  {(x(s), y(s))  :  0  <  s  <  L}  :  3?  — >■  ST,  (3.1) 

where  L  denotes  the  length  of  the  contour  (7,  and  denotes  the  entire  domain  of  an  im¬ 
age  I(x,y).  The  corresponding  expression  in  a  discrete  domain  approximates  the  continuous 
expression  as 

C(s)  &  C(n)  —  {(x(n),  y(n))  :  0  <  n  <  TV,  5  =  0  +  nAs},  (3.2) 

1  Although  snakes  can  be  defined  as  opened  curves,  we  are  interested  in  only  the  case  of  closed  curves,  i.e. 
contours  (7(0)  =  (7(L),  because  our  objective  is  image  segmentation. 


13 


where  L  =  N As.  An  energy  function  E(C)  can  be  defined  on  the  contour  such  as 


E(C)  —  Eint  +  Eext ,  (3-3) 

where  Eint  and  Eext  respectively  denote  the  internal  energy  and  external  energy  functions. 
The  internal  energy  function  determines  the  regularity,  i.e.  smooth  shape,  of  the  contour.  A 
common  choice  for  the  internal  energy  is  a  quadratic  functional  given  by 

Eint  =  [L  a\C'{s)\2  +  (3\C"{s)\2ds  (3.4) 

J  0 

*  E(«|C>)|2  +  /?|C>)|2)AS. 

n=0 

Here  a  controls  the  tension  of  the  contour,  and  /3  controls  the  rigidity  of  the  contour.  The 
external  energy  term  determines  the  criteria  of  contour  evolution  depending  on  the  image 
I(x,y),  and  can  be  defined  as 

fL  N 

Eext=  Eimg(C(s))ds^  Y^Eimg(C(n))As  ,  (3.5) 

n= 0 

where  Eimg(x,y)  denotes  a  scalar  function  defined  on  the  image  plane,  so  the  local  minimum 
of  Eimg  attracts  the  snakes  to  edges.  A  common  example  of  the  edge  attraction  function  is  a 
function  of  image  gradient,  given  by 


Eimg(x,y)-  A|VGW(x>J/)|  (3‘6) 

where  Ga  denotes  a  Gaussian  smoothing  filter  with  the  standard  deviation  cr,  and  A  is  a  suitably 
chosen  constant.  Solving  the  problem  of  snakes  is  to  find  the  contour  C  that  minimizes  the 
total  energy  term  E  with  the  given  set  of  weights  <a,  /?,  and  A.  In  numerical  experiments,  a  set 
of  snake  points  residing  on  the  image  plane  are  defined  in  the  initial  stage,  and  then  the  next 
position  of  those  snake  points  are  determined  by  the  local  minimum  E.  The  connected  form  of 
those  snake  points  is  considered  as  the  contour. 

The  classic  snakes  provide  an  accurate  location  of  the  edges  only  if  the  initial  contour  is 
given  sufficiently  near  the  edges  because  they  make  use  of  only  the  local  information  along  the 
contour.  Estimating  a  proper  position  of  initial  contours  without  prior  knowledge  is  a  difficult 
problem.  Also,  classic  snakes  cannot  detect  more  than  one  boundary  simultaneously  because 
the  snakes  maintain  the  same  topology  during  the  evolution  stage.  That  is,  snakes  cannot  split 
to  multiple  boundaries  or  merge  from  multiple  initial  contours.  Level  set  theory  [3]  has  given  a 
solution  for  this  problem. 


3.2  Level  Set  Methods 

Level  set  theory,  a  formulation  to  implement  active  contours,  was  proposed  by  Osher  and 
Sethian  [3].  They  represented  a  contour  implicitly  via  a  two-dimensional  Lipschitz-continuous 
function  y)  :  ft  — ►  3?  defined  on  the  image  plane.  The  function  <j){x,  y)  is  called  level  set 
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function ,  and  a  particular  level,  usually  the  zero  level,  of  (j){x,  y)  is  defined  as  the  contour,  such 
as 

c  =  (0,  y)  :  <f)(x,  y )  =  0},  V(x,  y)  €  fi,  (3.7) 

where  ft  denotes  the  entire  image  plane.  Figure  3.1(a)  shows  the  evolution  of  level  set  function 
0(x,y),  and  figure  3.1(b)  shows  the  propagation  of  the  corresponding  contours  C.  As  the  level 
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Figure  3.1:  Level  set  evolution  and  the  corresponding  contour  propagation:  (a)  topological  view 
of  level  set  (j)(x,y )  evolution,  (b)  the  changes  on  the  zero  level  set  C  :  (/>(x,y)  =  0 


set  function  (j)(x,y)  increases  from  its  initial  stage,  the  corresponding  set  of  contours  (7,  i.e.  the 
red  contour,  propagates  toward  outside.2  With  this  definition,  the  evolution  of  the  contour  is 
equivalent  to  the  evolution  of  the  level  set  function,  i.e.  dC/dt  =  d<f>(x,  y)/dt.  The  advantage 
of  using  the  zero  level  is  that  a  contour  can  be  defined  as  the  border  between  a  positive  area 
and  a  negative  area,  so  the  contours  can  be  identified  by  just  checking  the  sign  of  cj)(x,y).  The 
initial  level  set  function  <fio(x,y)  :  ft  — >  3?  may  be  given  by  the  signed  distance  from  the  initial 
contour  such  as, 


4>o{x,y) 


{4>{x,y)  ■  t  =  0} 

±D((x,y),NXtV(C0))  ’ 


V(x,y)  €  Cl, 


(3.8) 


where  ±D(a,b)  denotes  a  signed  distance  between  a  and  6,  and  Nx^{Cq)  denotes  the  nearest 
neighbor  pixel  on  initial  contours  Co  =  C(t  —  0)  from  (x,y).  Figure  3.2(a)  shows  an  exam¬ 
ple  of  initial  contours  Co,  and  figure  3.2(b)  shows  the  initial  level  set  function  </>o (x,y)  as  the 
signed  distance  computed  from  the  initial  contour  Co-  c/)o(x,y )  increases,  i.e.  become  brighter, 
as  a  pixel  (x,  y)  is  located  further  inwards  from  the  initial  contours  Co,  while  0o(^5  y)  decreases, 
i.e.  become  darker,  as  the  pixel  is  located  further  outwards  from  the  initial  contours.  The  initial 
level  set  function  is  zero  at  the  initial  contour  points  given  by,  cf)o(x,y)  =  0,  M(x,y)  G  Cq. 


The  deformation  of  the  contour  is  generally  represented  in  a  numerical  form  as  a  PDE.  A 
formulation  of  contour  evolution  using  the  magnitude  of  the  gradient  of  0(x,  y)  was  initially 
proposed  by  Osher  and  Sethian  [69,  70,  3],  given  by 

—  =  | V0(s,  y) l(v  +  £k(0(x,  y)))  ,  (3.9) 

2In  figure  3.1,  the  amount  of  level  set  evolution  is  set  as  a  constant  along  the  entire  domain,  d<j>(x,y) / dt  =  c, 
V(x,y)  G  P,  for  easy  understanding.  Normally,  d<j>(x,y)/dt  is  a  function  of  spatial  coordinates  (x,y). 
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Figure  3.2:  Initial  contours  and  corresponding  signed  distance:  (a)  the  initial  contour  Co,  (b) 
the  initial  level  set  function  fro (x,y)  determined  by  the  signed  distance  ±D((x,  y),  Nx,y(Co)) 


where  v  denotes  a  constant  speed  term  to  push  or  pull  the  contour,  k,(-)  :  ft  — ►  9?  denotes  the 
mean  curvature  of  the  level  set  function  fr(x,y)  given  by 

n{fr{x,y))  =  div 

frxxfr^  ~  ‘Zfrxfryfrxy  +  fr yy fr^  /« 

(€  +  <%)*'*  ’  ^  j 

where  <frx  and  frxx  denote  the  first-  and  second-order  partial  derivatives  of  fr(x,  y)  respect  to  x, 
and  fry  and  (fryy  denote  the  same  respect  to  y.  The  role  of  the  curvature  term  is  to  control  the 
regularity  of  the  contours  as  the  internal  energy  term  Eint  does  in  the  classic  snakes  model,  and 
e  controls  the  balance  between  the  regularity  and  robustness  of  the  contour  evolution. 


Another  form  of  contour  evolution  was  proposed  by  Chan  and  Vese  [9,  71].  The  length  of 
the  contour  |C|  can  be  approximated  by  a  function  of  fr(x,y)  [72,  73],  such  as 


|C|  «  Le(fr(x,y))  =  [  \VHe(fr(x,y))\dxdy 

Jn 

=  [  Se(cf)(x,y))\V(f)(x,y)\dxdy , 


(3.11) 


where  He(-)  denotes  the  regularized  form  of  the  unit  step  function3  H(-):  ft  — >  3?  given  by 

1,  if  fr{x,y)  >  0 


H(x,y)  = 


0,  if  fr(x,  y)  <  0 


,  V(x,y)  G  ft, 


(3.12) 


and  Se(-)  denotes  the  derivative  of  He(-).  Since  the  unit  step  function  produces  either  0  or 
1  depending  on  the  sign  of  the  input,  the  derivative  of  the  unit  step  function  produces  non¬ 
zero  only  where  <fr(x,  y)  =  0,  i.e.  on  the  contour  C.  Consequently,  the  integration  shown  in 
equation  3.11  is  equivalent  to  the  length  of  contours  on  the  image  plane.  The  associated  Euler- 
Lagrange  equation  [74]  obtained  by  minimizing  Le(*)  with  respect  to  <fr  and  parameterizing  the 

3 This  unit  step  function  is  often  referred  to  Heaviside  function. 
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(3.13) 


descent  directions  by  an  artificial  time  t  is  given  by 

—  =  Se{<f>(x,y))K(<f>(x,y)). 

The  contour  evolution  motivated  by  the  equation  above  can  be  interpreted  as  the  motion  by 
mean  curvature  minimizing  the  length  of  the  contour.  Therefore,  equation  3.9  is  considered  as 
the  motion  motivated  by  PDE,  while  equation  3.13  is  considered  as  the  motion  motivated  by 
energy  minimization. 

An  outstanding  characteristic  of  level  set  methods  is  that  contours  can  split  or  merge  as  the 
topology  of  the  level  set  function  changes.  Therefore,  level  set  methods  can  detect  more  than 
one  boundary  simultaneously,  and  multiple  initial  contours  can  be  placed.  Figure  3.3(a)  shows 
an  example  of  the  topological  changes  on  a  level  set  function,  while  figure  3.3(b)  shows  how  the 
initially  separated  contours  merge  as  the  topology  of  level  set  function  varies.  This  flexibility 


Figure  3.3:  The  change  of  topology  observed  in  the  evolution  of  level  set  function  and  the 
propagation  of  corresponding  contours:  (a)  the  topological  view  of  level  set  (j){x,  y)  evolution, 
(b)  the  changes  on  the  zero  level  set  C  :  (f>(x,y)  —  0 

and  convenience  provide  a  means  for  an  autonomous  segmentation  by  using  a  predefined  set  of 
initial  contours.  The  computational  cost  of  level  set  methods  is  high  because  the  computation 
should  be  done  on  the  same  dimension  as  the  image  plane  ft.  Thus,  the  convergence  speed  is 
relatively  slower  than  other  segmentation  methods,  particularly  local  filtering  based  methods. 
The  high  computational  cost  can  be  compensated  by  using  multiple  initial  contours.  The  use 
of  multiple  initial  contours  increases  the  convergence  speed  by  cooperating  with  neighbor  con¬ 
tours  quickly.  Level  set  methods  with  faster  convergence,  called  fast  marching  methods  [75], 
have  been  studied  intensively  for  the  last  decade.  Because  of  these  attractive  properties,  we 
implement  the  proposed  active  contour  model  using  the  level  set  method. 


3.3  Edge-based  Active  Contours 

Edge-based  active  contours  are  closely  related  to  the  edge-based  segmentation.  Most  edge-based 
active  contour  models  consist  of  two  parts:  the  regularity  part,  which  determines  the  shape  of 
contours,  and  the  edge  detection  part,  which  attracts  the  contour  towards  the  edges. 


17 


Geometric  active  contour  model  was  proposed  by  Caselles  et  al.  [79]  adding  an  additional 
term,  called  stopping  function ,  to  the  speed  function  shown  in  equation  3.9.  It  was  the  first 
level  set  implemented  active  contour  model  for  the  image  segmentation  problem.  Malladi  et 
al.  [80,  76]  proposed  a  similar  model  given  by 

—  =  g(Vx,  y))  +  OIWO,  y)  |,  (3.14) 

where  g(-)  :  Q  — ►  3?  denotes  the  stopping  function,  i.e.  a  positive  and  decreasing  function  of 
the  image  gradient.  A  simple  example  of  the  stopping  function  is  given  by 


g(Hx,y)) 


1 

1  +  \VI(x,  y)\n 


(3.15) 


where  n  is  given  as  1  in  [79]  and  2  in  [80].  Note  that  \VI(x,y)\  can  be  interchangeably  used 
with  Eimg  shown  in  equation  3.6.  The  contours  move  in  the  normal  direction  with  a  speed  of 
g{I(x ,  y))  +  z/),  and  therefore  stops  on  the  edges,  where  g(-)  vanishes.  The  curvature 

term  ft(-)  maintains  the  regularity  of  the  contours,  while  the  constant  term  v  accelerates  and 
keeps  the  contour  evolution  by  minimizing  the  enclosed  area  [81]. 


Geodesic  active  contour  model  was  proposed  by  Caselles  et  al.  [82,  83]  after  the  geometric 
active  contour  model.  Kichenassamy  et  al.  [84]  and  Yezzi  et  al.  [85]  also  proposed  a  similar 
active  contour  model.  Based  on  the  principle  of  the  classic  dynamic  systems,  solving  the  active 
contour  problem  is  equivalent  to  finding  a  path  of  minimal  distance,  called  geodesic  curve  [86] 
given  by 

dC 

—  =  (, g(I(x,y))K((f)(x,y ))  -  Vg(I(x,y))  (3.16) 

where  J\f  denotes  the  inward  unit  normal4  given  by 


Vcf) 


II  w 


(3.17) 


From  the  relation  between  a  contour  and  a  level  set  function  and  the  level  set  formulation  of 
the  steepest  descent  method,  solving  this  geodesic  problem  is  equivalent  to  searching  for  the 
steady  state  of  the  level  set  evolution  equation  [82,  87]  given  by 


—  =  g(I(x,y))(K(<l>(x,y))  +  i/)\V<l>(x,y)\  +  Vg(I(x,y))  ■  V</>(x,y),  (3.18) 

where  v  is  an  additional  speed  term  to  accelerate  the  evolution.5  The  equivalence  between  clas¬ 
sic  snakes  and  geodesic  active  contours  has  been  also  shown  by  other  authors  [88,  89,  90,  91]  in 
slightly  different  views.  We  can  notice  that  the  geodesic  active  contour  model  shown  in  equa¬ 
tion  3.18  is  identical  to  the  geometric  active  contour  model  shown  in  equation  3.14  except  for 
the  advection  term  \7g(I(x,y))  •  V(j)(x,y).  Geodesic  active  contours  have  been  the  most  pop¬ 
ular  methods  among  the  edge-based  active  contour  models,  and  their  applications  have  been 
extended  to  multispectral  images  by  Sapiro. 


Color  snakes  is  the  geodesic  active  contour  model  particularly  for  multispectral  images 
proposed  by  Sapiro  [86,  92,  93,  87].  In  order  to  detect  the  edges  in  multispectral  images,  a 

4 Some  authors  define  J\f  as  the  outward  unit  normal  with  an  opposite  sign. 

5  The  level  set  function  still  slowly  converges  even  without  v. 
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special  gradient  function  based  on  Riemannian  geometry,  called  color  gradient  function,  is  used 
instead  of  the  traditional  image  gradient  function.  A  simple  example  of  the  color  gradient 


function  is  given  by 


g(J(x,y)) 


1 

1  +  (A+  -  A-)  ’ 


(3.19) 


where  A+  and  A_  respectively  present  the  maximal  and  the  minimal  rate  of  changes  on  the 
multispectral  image  I (x,y).  A+  and  A_  are  the  eigenvalues  of  the  metric  tensor  [94]  given  by 


i  r  (9h\2 

9u  9 12  _  \dx  )  dx  dy 

921  922\~  ^  9hdlb  f&Lb']2 
°  L  dy  dx  y  dy  J 


(3.20) 


where  denotes  the  6-th  band  of  the  multispectral  image  I(x,  y ).  In  the  case  of  scalar  images, 
i.e.  gray  images,  A+  =  |V/|2  and  A_  =  0,  so  the  stopping  function  shown  in  equation  3.19 
become  identical  to  equation  3.15.  Sapiro  also  introduced  color  self-snakes  [86,  95,  93,  87], 
which  diffuse  the  image  in  the  same  way  that  we  evolve  the  level  set  function.  This  is  an  edge¬ 
preserving  smoothing  method  like  MFA  or  image  diffusion  using  active  contours. 


Due  to  the  structure  of  the  speed  functions  and  the  stopping  functions  in  equation  3.9 
and  equation  3.18,  edge-based  active  contour  models  have  a  few  disadvantages  compared  to  the 
region-based  active  contour  models,  discussed  in  the  next  section.  Because  of  the  constant  term 
z/,  edge-based  active  contour  models  evolve  the  contour  towards  only  one  direction,  either  inside 
or  outside.  Therefore,  an  initial  contour  should  be  placed  completely  inside  or  outside  of  ROI, 
and  some  level  of  a  prior  knowledge  is  still  required.  Also,  edge-based  active  contours  inherit 
some  disadvantages  of  the  edge-based  segmentation  methods  due  to  the  similar  technique  used. 
Since  both  edge-based  segmentation  and  edge-based  active  contours  rely  on  the  image  gradient 
operation,  edge-based  active  contours  may  skip  the  blurry  boundaries,  and  they  are  sensitive 
to  local  minima  or  noise  as  edge-based  segmentation  does.  Gradient  vector  flow  fast  geodesic 
active  contours  [96,  97]  proposed  by  Paragios  replaced  the  edge  detection  (boundary  attraction) 
term  with  gradient  vector  field  [98,  99,  100,  101,  102],  that  refers  to  a  spatial  diffusion  of  the 
boundary  information  and  guides  the  propagation  to  the  object  boundaries  from  both  sides,  to 
give  more  freedom  from  the  restriction  of  initial  contour  position. 


3.4  Region-based  Active  Contours 

Most  region-based  active  contour  models  consist  of  two  parts:  the  regularity  part,  which  de¬ 
termines  the  smooth  shape  of  contours,  and  the  energy  minimization  part,  which  searches  for 
uniformity  of  a  desired  feature  within  a  subset.  A  nice  characteristic  of  region-based  active 
contours  is  that  the  initial  contours  can  be  located  anywhere  in  the  image  as  region-based 
segmentation  relies  on  the  global  energy  minimization  rather  than  local  energy  minimization. 
Therefore,  less  prior  knowledge  is  required  than  edge-based  active  contours. 

Piecewise- constant  active  contour 6  model  was  proposed  by  Chan  and  Vese  [9,  71]  using  the 
Mumford-Shah  segmentation  model  [105,  106].  Piecewise-constant  active  contour  model  moves 

6  Although  the  titles  of  [9,  71]  are  “Active  Contours  without  Edges,”  all  region-based  active  contours  introduced 
in  this  section  also  do  not  use  edge  information.  Thus,  we  refer  to  this  method  as  piecewise-constant  active  contour 
model  through  this  document. 
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deformable  contours  minimizing  an  energy  function  instead  of  searching  edges.  A  constant  ap¬ 
proximates  the  statistical  information  of  image  intensity  within  a  subset,  and  a  set  of  constants, 
i.e.  a  piecewise-constant,  approximate  the  statistics  of  image  intensity  along  the  entire  domain 
of  an  image.  The  energy  function  measures  the  difference  between  the  piecewise-constant  and 
the  actual  image  intensity  at  every  image  pixel.  The  level  set  evolution  equation  is  given  by 

~  =  Se(<f>(x,y))[i/K(<f>(x,y))  -  {(I(x,  y)  -  mf  -  (. I{x,y )  —  /x0)2}],  (3.21) 

where  /io  and  fi\  respectively  denote  the  mean  of  the  image  intensity  within  the  two  subsets, 
i.e.  the  outside  and  inside  of  contours.  The  final  partitioned  image  can  be  represented  as  a  set 
of  piecewise-constants,  where  each  subset  is  represented  as  a  constant.  This  method  has  shown 
the  fastest  convergence  speed  among  region-based  active  contours  due  to  the  simple  represen¬ 
tation.  Lee  et  al.  [107]  showed  an  improvement  to  the  piecewise-constant  active  contour  model 
on  illuminated  images  by  proposing  an  alternative  energy  function. 


Piecewise-smooth  active  contour  model,  an  extension  of  piecewise-constant  model  using  a 
set  of  smoothed  partial  images,  was  also  proposed  by  Chan  and  Vese  [108,  109,  110,  111,  112]. 
The  same  segmentation  principles  used  for  piecewise-constant  model  partitions  an  image,  but 
a  smoothed  partial  images  instead  of  constants  represent  each  subset.  The  level  set  evolution 
equation  is  given  by 


dfax^y) 

dt 


5e(<p(x,y)) 


vk(4>(x,  y))  -  {( I(x,y )  -  m(x,y))2  -  (. I(x,y )  -  y0(x,y))2} 
m{x,y)\2  -  |V/x0(x,y)|2) 


(3.22) 


where  (io(x,y)  and  respectively  denote  the  smoothed  images  within  the  outside  and 

inside  of  contours.  The  segmentation  result  of  piecewise-smooth  active  contours  is  similar  to 
the  segmentation  by  color  self-snakes  because  of  the  similar  approach. 


Although  traditional  region-based  active  contours  partition  an  image  into  multiple  sub¬ 
regions,  those  multiple  regions  belong  to  only  two  subsets:  either  the  inside  or  the  outside 
of  contours.  Chan  and  Vese  proposed  multi-phase  active  contour  model7  [108,  74,  113,  114, 
115],  which  increases  the  number  of  subsets8  that  active  contours  can  find  simultaneously. 
Multiple  active  contours  evolve  independently  based  on  the  piecewise-constant  model  shown  in 
equation  3.21  or  the  piecewise-smooth  model  shown  in  equation  3.22,  and  multiple  subsets  are 
defined  by  a  group  of  disjoint  combination  of  the  level  set  functions.  For  example,  N  level  set 
functions  define  maximum  2N  subsets  of  the  entire  region.  An  example  of  subsets  defined  by 
4-phase  active  contours  is 

flo 

^2 

7Here,  the  term  multi  refers  to  more  than  two  instead  of  more  than  one.  Two  terms  phase  and  subset  are 
considered  identical  in  the  image  segmentation  problem,  and  used  interchangeably  in  this  document. 

8  A  sub-region  refers  to  a  group  of  image  pixels  that  have  similar  property  and  reside  within  the  same  bound¬ 
aries.  A  subset  refers  to  a  group  of  the  subregions  that  have  similar  property  but  does  not  necessarily  reside 
within  the  same  boundary. 


r 

III 

TT 

i - 

fa(x,y)  <  0,fa(x,y)  <  0 
fa (x,y)  <  0,fa(x,y )  >  0 
fa(x,y)  >  O,0i (x,y)  <  0 
fa (x,y)  >  0,0!  (x,y)  >  0 


(3.23) 
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where  {Do,  Di,  D2,  D3}  denote  the  four  subsets  defined  by  two  level  set  functions  {</>i,</>2}, 
i.e.  two  active  contours.  The  level  set  evolution  equation  for  this  case  is  given  by 


d^ijx/y) 

dt 


Se(Mx,y)) 


|i//c((/>i(x,y)) 


d<t>2{x,y ) 
dt 


Se(Mx,y)) 


|i/«;((/>2(x,y)) 


f  (J(x,y)-y 3)2- 

1  (I(x,y)-y,  2)2 

(I(x,y)  -  in)2-  1 
(I(x,y)  -  yo)2  J 


H2+ 


(1  -  H2) 


j  (i(x,y)~v 3)2- 
l  (I(x,y)-n  1)2 
(I(x,y)  -  y2)2-  1 
(I(x,y)-y  o)2  J 


Hi+ 


(1  -  Hi) 


}  ,(3.24) 


where  Hn  =  He((f)n(x,y))  and  {/io,  Mi,  M2,  M3}  denote  the  mean  of  image  intensity  within 
the  corresponding  subsets  {Do,  Di,  D2,  D3}.  Rousson  and  Deriche  [116,  10,  117]  and  Yezzi  et 
al.  [118,  119]  proposed  a  similar  multi-phase  active  contour  model  for  segmentation  problems, 
and  Samson  et  al.  [120,  121,  122]  also  proposed  a  multi-phase  active  contour  model  for  pattern 
classification  problems.  Multi-phase  active  contours  provide  a  means  to  integrate  segmentation 
and  pattern  classification  tasks,  m-phase  active  contours  partition  the  image  into  multiple 
sub- regions  (^>  m),  and  they  simultaneously  identify  those  regions  into  m-subsets,  i.e.  classes. 
Depending  whether  training  samples  are  provided  or  not,  supervised  or  unsupervised  segmen¬ 
tation  can  actually  perform  supervised  or  unsupervised  pattern  classification.  This  provides 
a  way  to  the  autonomous  pattern  classification  technology  reducing  the  number  of  procedures 
and  processing  time. 


The  same  segmentation  principle  can  be  extended  to  multispectral  images  by  taking  the 
mean  of  energy  functions  measured  at  each  band  [5,  123].  The  level  set  evolution  equation  of 
2-phase  active  contour  model  is  given  by 

—  =  5e(<f)(x,  y))[vK(<f>(x,  y))  -  Y^{ujlb(Ib(x,  y)  -  ylb)2  -  uob( Ib(x,  y)  -  y0b)2}],  (3.25) 

where  \ii  —  [Mil,  Mi 2, . . . ,  Mi6, . . . ,  MiB]T  denotes  the  mean  vector  of  the  vector  valued  image  in¬ 
tensity  I (x,y)  within  the  corresponding  subset  D^. 

Rousson  and  Deriche  proposed  a  variational  formulation  obtained  from  a  Bayesian  segmen¬ 
tation  model  [116,  10,  117].  While  the  piecewise-constant  active  contour  model  uses  a  group  of 
constants  to  represent  subsets,  this  method  implicitly  uses  a  conditional  PDF  of  a  given  value 
I (x,y)  with  respect  to  the  hypothesis,  i.e.  a  unimodal  (multivariate)  Gaussian  distribution, 
given  by 

p(l(s>j/)|{/*i>s*})  =  (27r)rf/2|S.|i/2exp  (-^(I(x,y)  -^)TS“1(I(x,y)  -n)j  ,  (3.26) 

where  and  are  the  parameters  of  a  (multivariate)  Gaussian  distribution.  Taking  the 
negative  log  form  of  the  PDF  as  an  energy  function,  the  level  set  evolution  equation  minimizing 
the  energy  function  is  given  by 

—  =  5£(<j>(x,y))[i/K(<j)(x,y))  -  {ex (x,y)  -  e0{x,y)}\  ,  (3.27) 
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where  the  objective  functions  ei  are  given  by 

e;0,y)  =  —  logp(I(x,  y)\Hi,  Sj) 

-  log|Si|  +  (l(x,y)  -  ^i)TSr1(I(x,y)  -  ^).  (3.28) 

The  extension  of  this  method  for  multispectral  images  is  trivial  as  the  energy  function  is  based 
on  the  multivariate  PDF.  A  unique  feature  of  this  method  is  that  multivariate  PDF  consider 
each  band  of  multispectral  images  as  independent  dimension  of  an  image  intensity  space.  Mul¬ 
tivariate  PDF  does  not  make  big  improvement  in  this  active  contour  model,  but  it  does  make 
a  significant  improvement  in  the  proposed  active  contour  models  using  a  mixture  density  func¬ 
tion.  More  detail  of  the  improvement  will  be  discussed  in  chapter  5.2. 

Due  to  the  global  energy  minimization,  region-based  active  contours  generally  do  not  have 
any  restriction  on  the  placement  of  initial  contours.  That  is,  region-based  active  contour  can 
detect  interior  boundaries  regardless  of  the  position  of  initial  contours.  The  use  of  pre-defined 
initial  contours  provides  a  method  of  autonomous  segmentation.  Also,  they  are  less  sensitive 
to  local  minima  or  noise  than  edge-based  active  contours.  However,  due  to  the  assumption 
of  uniform  image  intensity,  most  methods  are  applicable  only  to  images  where  each  subset  is 
representable  by  a  simple  expression,  e.g.  single  Gaussian  distribution  or  a  constant.  If  a  sub¬ 
set,  i.e.  class,  consists  of  multiple  distinctive  sub-classes,  e.g.  the  zebra  shown  in  figure  1.3(b), 
these  methods  would  produce  over-segmented  or  under-segmented  results.  We  propose  novel 
region-based  active  contour  models  which  produce  improved  results  using  multivariate  mixture 
density  functions. 


3.5  Active  Contours  Integrating  Edge-  and  Region-based  Seg¬ 
mentation 


In  order  to  improve  the  segmentation  performance,  the  integration  of  edge-  and  region-based 
information  sources  using  active  contours  has  been  proposed  by  a  few  authors.  Geodesic  active 
region  is  a  supervised  active  contour  model,  proposed  by  Paragios  [124,  125,  126],  integrating 
edge-  and  region-based  segmentation  module  in  an  energy  function.  A  statistical  analysis 
based  on  the  Minimum  Description  Length  (MDL)  criterion  and  the  Maximum  Likelihood 
(ML)  principle  for  the  observed  density  function,  i.e.  an  image  histogram,  indicates  the  number 
of  sub-regions  and  the  statistical  PDF  within  those  sub-regions  using  a  mixture  of  Gaussian 
elements.  Regional  probability  is  estimated  from  the  statistical  PDF  based  on  prior  knowledge, 
i.e.  training  samples.  Then,  the  boundary  information  is  determined  by  a  probabilistic  edge 
detector ,  estimated  from  the  regional  probabilities  of  neighborhood  [127,  128].  For  example, 
an  image  pixel  is  more  likely  an  edge  pixel  if  the  neighborhood  pixels,  located  on  the  opposite 
sides,  have  high  regional  probabilities  for  a  different  class.  The  level  set  evolution  equation  is 
given  by 


dt 


\X7<f>(x,y)\ 


aJ2b^b  log 


pi(Ib(x,y))\ 

p2(Ib(x,y)) ) 


+ 


(1  -  a)  { g(pB(x ,  v))k(Hx ,  y ))  +  \/g(pe(x,  y )) 


Y^jxgj)  \ 
IW(x,3/)|  / 


(3.29) 


where  Ib(x,y)  denotes  6-th  band  of  a  multispectral  image  I(cc,  y).  and  pi(Ib(x,y))  denotes  the 
regional  probability  presenting  a  probability  that  a  pixel  Ib(x,  y)  belongs  to  sub-region  i,  pe(x,  y) 
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denotes  the  probabilistic  edge  detector  presenting  a  probability  that  a  boundary  pixel  is  located 
at  (x,y),  and  g(pe)  denotes  a  positive  and  decreasing  function  of  the  probability.  The  regional 
probability  is  computed  from  each  band  and  accumulated.  A  detail  to  determine  both  proba¬ 
bilities,  pi(Ib(x,y))  and  pe(x,y),  is  explained  in  [128].  The  geodesic  active  region  model  is  later 
applied  to  a  medical  imaging  problem  [129,  130]  with  a  gradient  vector  flow-based  boundary 
component.  The  approach  was  based  on  a  coupled  propagation  of  two  active  contours,  and 
integrates  visual  information  with  anatomical  constraints. 

Jehan-Besson  et  al.  also  proposed  an  active  contour  model  [131,  132]  minimizing  an  en¬ 
ergy  criterion  involving  both  region  and  boundary  functionals.  These  functionals  are  derived 
through  a  shape  derivative  approach  instead  of  classical  calculus  of  variation.  They  focus  on 
statistical  property,  i.e.  the  PDF  of  the  color  histogram  of  a  sub-region.  Active  contours  are 
propagated  minimizing  the  distance  between  two  histograms  for  matching  or  tracking  purposes. 
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Chapter  4 

Region-based  Segmentation  using 
Active  Contours 


Region-based  segmentation  looks  for  uniformity  within  a  sub-region  based  on  a  desired  feature, 
e.g.  intensity,  color,  and  texture.  Region-based  active  contour  models  have  shown  attractive 
characteristics,  such  as  the  unrestricted  position  of  initial  contours,  the  automatic  detection  of 
interior  boundaries,  and  reasonable  segmentation  due  to  global  energy  minimization  though  the 
segmentation  results  are  still  case  dependent.  Region-based  active  contours  evolve  deformable 
shapes  based  on  two  forces:  energy  minimization  based  on  the  statistical  properties,  which  pur¬ 
sues  the  uniformity  within  each  subset,  and  curvature  motion  motivated  by  level  set  function, 
which  keeps  the  regularity  of  active  contours.  In  this  chapter,  we  propose  a  novel  segmentation 
principle  based  on  regional  energy  minimization  using  the  statistics  of  image  intensity.  First, 
we  redefine  the  terminology  used  in  the  image  segmentation  problem  in  section  4.1.  Then, 
we  discuss  the  base  segmentation  model,  proposed  by  Mumford-Shah,  of  the  proposed  active 
contour  model  in  section  4.2.  Finally,  we  discuss  the  proposed  segmentation  principle  and  its 
implementation  in  section  4.3. 


4.1  Image,  Subset,  and  Sub-region 

Let  us  redefine  the  notation  of  terms  used  in  our  segmentation  model.1  As  introduced  in 
chapter  1,  an  image  I (x,y)  is  the  native  input  data  of  the  image  analysis,  given  as  a  form  of 
a  function  defined  on  a  two-dimensional  spatial  domain.  We  define  a  multispectral  image  as 
a  general  form  of  images  and  a  scalar  image  as  a  particular  case  of  multispectral  images.  A 
multispectral  image  I(x,  y)  can  be  defined  as  a  set  of  vectors  given  by 

I(z, y)  =  [h(x, y),I 2(x, y),..., Ih{x, y),..., IB(x, y)]T  (4.1) 

where  Ib(x,  y )  denotes  a  scalar  image  measured  at  band  b.  Let  the  vector- valued  image  intensity 
of  I(x,  y )  be  a  multi-dimensional  random  variable  I  E  ?RB  where  B  denotes  the  dimension  of  I 
and  is  equivalent  to  the  number  of  optical  bands  measured. 

Let  4/  represent  the  entire  region  of  an  image  I(x,  y).  Image  segmentation  is  a  task  to  par¬ 
tition  the  entire  region  4/  into  n  sub-regions,  {4>i,  4>2, . . . ,  4q, . . . ,  4/n},  with  the  criteria  shown 
in  table  4.1.  C{  denotes  the  boundary  wrapping  sub-region  4q.  The  first  and  second  condi- 

1  These  notations  are  not  necessarily  identical  to  the  terms  introduced  in  other  publications. 
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Table  4.1:  The  criteria  of  general  image  segmentation 


1.  c  =  u  Ci. 

2=1 

2.  Ci  p|  Cj  /  0  if  and  Tj  are  neighbors. 

3.  *=  (0 

4.  I(x,y)  are  connected,  V(x,y)  E  \Eh,  Vi. 

5.  =  0>  if  J- 


tions  indicate  the  property  of  boundaries  wrapping  sub-regions  {\Eq}.  As  each  sub-region  has 
a  boundary,  the  boundaries  of  two  neighbor  sub-regions  are  overlapped.  C  denotes  the  entire 
set  of  boundaries.2  The  third  condition  indicates  that  the  segmentation  must  be  complete; 
that  is,  every  image  pixel  should  be  an  element  of  a  sub-region  \Eq  or  boundaries  C.  The  fourth 
condition  requires  that  all  image  pixels  in  a  sub-region  must  be  connected  in  a  predefined  sense; 
that  is,  they  should  be  located  at  the  inside  of  a  boundary.  The  fifth  condition  indicates  that 
the  sub-regions  must  be  disjoint  each  other,  so  an  image  pixel  should  be  an  element  of  only 
one  sub-region.  Here,  we  can  notice  the  difference  between  the  image  segmentation  problem 
and  the  pattern  classification  problem.  A  data  sample  can  be  a  member  of  multiple  classes  in 
pattern  classification,  but  an  image  pixel  should  be  a  member  of  only  one  sub-region  in  image 
segmentation. 

Table  4.1  lists  the  criteria  of  general  image  segmentation,  and  we  here  introduce  slightly 
different  criteria  for  region-based  segmentation.  Let  a  set  f 2,  instead  of  a  region  4/,  represent  the 
entire  domain  of  an  image  I (x,y).  The  region-based  image  segmentation  is  a  task  to  partition 
the  entire  set  ft  of  an  image  into  m  subsets,  {fli,  ft2,  •  •  • ,  fti,  •  •  • ,  with  the  criteria  shown  in 
table  4.2.  The  only  difference  between  a  subset  fti  and  a  sub-region  4/j  is  that  a  subset  ft{  does 


Table  4.2:  The  criteria  of  region-based  image  segmentation 

m 

1.  c  =  u 

2=1 

2.  Ci  p|  Cj  7^  0  if  fli  and  Qj  are  neighbors. 

3.  n=  (0 

4.  fj  Qj  =  0  for  Vi,  j  if  i  ^  j. 


not  necessarily  form  a  spatial  unit.  That  is,  Qi  may  contain  multiple  sub-regions  Tj  residing 
in  different  spatial  locations  on  the  entire  set  ft  of  an  image.  Following  expression  shows  the 
relation  between  n  subsets  and  m  sub-regions: 

=  (4.2) 

2 The  width  of  a  boundary  is  often  considered  as  infinitely  small  in  continuous  expression. 
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where  i  =  1,  2, . . . ,  m  and  j  =  1,2,...,  n3  (ft  =  4/)  denotes  the  entire  set  of  an  image  as  the 
largest  possible  spatial  unit,  while  (x,y)  denotes  an  image  pixel  as  the  smallest  possible  spatial 
unit.  Figure  4.1  shows  an  example  that  sub-regions  and  subsets  are  not  identical.  The  entire  set 


Figure  4.1:  Subsets  {£\b^i}  and  sub-regions  {4/q,  4>i,  4/2} 

of  the  image  (ft  =  4/)  consists  of  two  subsets  {flo5^i}  and  three  sub-regions  {4/o,  4>i,  4/2}.  A 
subset  fti  exists  at  two  different  spatial  locations,  where  each  of  them  is  independently  marked 
as  4>i  and  4>2-  Therefore,  two  main  approaches  of  segmentation,  i.e.  edge-  and  region-based, 
can  be  reintroduced  such  that  edge-based  segmentation  partitions  an  image  I (x,y)  into  multiple 
sub-regions  4/ j  searching  for  discontinuities  among  sub-regions,  while  region-based  segmenta¬ 
tion  partitions  an  image  I(x,  y)  into  multiple  subsets  fti  searching  for  uniformity  within  a  subset 

fV  ’  . 


4.2  The  Base  Segmentation  Model 

Mumford  and  Shah  [105,  106]  posed  the  image  segmentation  problem  as  a  variational  problem  to 
find  an  optimal  piecewise-smooth  approximation  f(x,  y )  of  the  given  scalar  image  /(x,  y )  and  a 
set  of  boundaries  C,  such  that  the  approximation  f(x,y)  varies  smoothly  within  the  connected 
components  of  the  subsets  excluding  the  boundaries,  i.e.  ft\C.  They  proposed  to  solve  the 
variational  segmentation  problem  by  minimizing  the  following  global  energy  function  [74] 

EMS(f,C)=  [  \I(x,y)  -  f(x,y)\2dxdy  +  12  [  |V/(x,  y)\2dxdy  +  v\C\,  (4.3) 

Jq  Jn\c 

with  respect  to  two  terms,  the  approximation  /  of  the  given  image  and  the  variational  bound¬ 
aries  C.  The  global  energy  function  EMS  consists  of  three  parts.  The  minimization  of  the  first 
part  approximates  the  image  I(x,y)  with  an  alternative  expression  f(x,y)  by  minimizing  the 
squared  difference  between  the  two  expressions  | I(x,  y)  —  f(x,  y)  |2.  The  second  part  piecewisely 
smoothes  /(x,  y)  by  minimizing  |V/(x,  y) |2  on  ft\C.  C  has  the  role  of  approximating  the  edges 
of  I(x,y).  The  third  part  smoothes  C  by  minimizing4  the  length  \C\.  The  existence  and  regu¬ 
larity  of  the  solution  of  the  problem  above  is  proven  in  [136,  105]. 


3According  to  the  definition,  m  <  n  in  general,  but  it  can  be  m<n  depending  on  the  image. 

4In  the  original  publication  [108,  105,  106],  the  integration  respect  to  one-dimensional  Hausdorff  measure  was 
used  instead  of  \C\.  The  equivalence  between  two  expressions  \C\  —  fc  dTi 1  is  shown  in  [74]. 
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The  global  piecewise  approximation  /(x,  y)  can  be  represented  as  a  sum  of  sub-approximations 
given  by 

f(x,  y)  =  Yl  y ),  V(x,  y)  €  Q,  (4.4) 

i 

where  fi  approximates  the  given  multispectral  image  I(x,  y)  within  and  Xi(xi  y)  :  ^  {0, 1} 

is  determined  by  the  spatial  domain  of  such  as 


Xi{x,y) 


1,  V(x,y)  €  Qi  1  v. 
0,  V(x,y)  ^  O*  J  ’  * 


(4.5) 


Note  that  is  not  necessarily  an  image  function  but  any  expression  that  represents  the  feature, 
used  as  the  region-based  segmentation  criteria,  of  the  image  within  fV 


The  global  energy  function  EMS(f ,  C)  given  in  equation  4.3  can  be  simplified  and  generalized 
by  ignoring  the  smoothing  term  and  defining  independent  objective  functions  for  each  subset, 
such  that 

E({fi}’C)  =  N  [  e(x,y\fi)dxdy +  v\C\,  (4.6) 

i  M 

where  the  variational  contour  C  determines  the  domain  of  variational  subsets  The  objective 
function  e{x,y\fi)  :  D  — >  5ft  determines  the  condition  of  region-based  segmentation  within  each 
Qi,  e.g.  the  uniformity  of  image  intensity  I,  and  how  well  the  approximation  fi  represents  the 
given  image.  Better  fi  results  in  lower  e  for  each  consequently  lower  E.  The  minimum  of  E 
is  achieved  by  two  sequential  minimization  procedures.  First,  the  minimization  of  E  with  re¬ 
spect  to  each  of  /^,  while  C  is  fixed,  finds  the  best  representations  of  each  of  the  given  image 
I (x,y)  minimizing  the  objective  function  e{x,y\fi).  Then,  the  minimization  of  E  with  respect 
to  C,  while  {fi}  are  fixed,  smoothes  the  variational  boundaries  C  minimizing  \C\.  Combination 
of  these  two  minimizations  leads  to  the  region-based  active  contour  evolution,  which  moves  a 
set  of  contours  C  satisfying  the  segmentation  constraints  on  the  given  image. 


Depending  on  the  objective  function  e(x,y\fi)  and  representation  /^,  various  active  contour 
models  can  be  achieved  from  the  global  energy  function  shown  in  equation  4.6.  The  energy 
function  of  piecewise-constant  active  contour  model  [5]  can  be  transformed  to  equation  4.6  with 
an  objective  function 


1  X  a 

e(x,  y\fi  =  =  —  ^2(Ib(x,  y)  -  //i>6)2,  \/(x,  y)  e  fij,  Vi,  (4.7) 

13  6=1 

where  fi  is  given  as  a  vector  fi{  =  [/ii, . . . ,  /i^, . . . ,  /i#]T  C  5ftB.  The  optimal  fi  minimizing  E  is 
the  mean  vector  of  I (x,y)  within  fV  The  energy  function  of  [10]  also  can  be  transformed  to 
the  same  form  with  an  objective  function 

e(x,y\fi  =  {/x^Sj})  =  log|£j|  +  (I(x,y)  -  ni)T'Er1(I(x,y)  -  ^),  V(z,  y)  €  Vi,  (4.8) 

where  fi  is  given  as  a  set  of  parameters,  i.e.  the  mean  vector  /Lq  and  covariance  matrix  5]^  of  a 
multivariate  Gaussian  PDF.  The  optimal  fi  minimizing  E  is  the  mean  vector  /x  and  covariance 
matrix  I]  of  I (x,y)  within  £V 
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4.3  Proposed  Region-based  Segmentation  Model 


The  objective  function  of  a  piecewise-constant  active  contour  model  shown  in  equation  4.7  mea¬ 
sures  the  average  Euclidean  distance  between  the  vector- valued  image  intensity  I (x,y)  and  the 
mean  vector  \ii  of  I (x,y)  within  a  subset  The  objective  function  shown  in  equation  4.8 
measures  the  Mahalanobis  distance  [22]  between  I (x,y)  and  weighted  by  the  covariance  5V 
Thus,  both  objective  functions  produce  the  minimum  if  I (x,y)  =  fi ^  and  produce  larger  val¬ 
ues  as  I (x,y)  and  /Lq  are  located  further  from  each  other  in  the  image  intensity  space.  The 
assumption  of  these  approaches  is  that  there  is  a  constant  value,  i.e.  fi  =  /Lq  or  fi  =  {/Lq, 
which  best  represents  and  consequently  the  entire  image  is  representable  as  a  piecewise- 
constant  expression  /.  However,  many  images  we  encounter  in  image  segmentation  problems, 
particularly  the  images  with  non-uniform  subsets  such  as  the  zebra  in  figure  1.3(b),  are  too 
complicated  to  be  represented  as  a  piecewise-constant  expression.  Therefore,  we  propose  an  ob¬ 
jective  function,  which  uses  a  conditional  PDF  of  I (x,y)  instead  of  a  distance  between  I (x,y) 
and  {fi}.  With  the  proposed  objective  function,  there  is  no  certain  best  representation  fi  but 
a  PDF  representation  pi  for  each  Depending  on  the  method  to  estimate  the  PDF,  there  are 
two  approaches,  unsupervised  and  supervised.  Unsupervised  method  partitions  images  looking 
for  uniform  statistics  of  image  intensity  within  a  subset,  while  supervised  method  does  the 
same  job  looking  for  similarity  between  the  statistics  of  image  intensity  and  the  corresponding 
training  samples  for  a  subset. 

If  an  image  pixel  is  likely  be  an  element  of  a  subset,  the  objective  function  should  produce 
a  lower  value  under  the  assumption  of  I (x,y)  G  fV  If  an  image  pixel  is  unlikely  be  an  element 
of  the  subset,  the  objective  function  should  produce  higher  value  under  the  same  assumption. 
The  objective  function 

e(x,y\pi)  = -log  pi,  M{x,y)<Efl,  Mi,  (4.9) 

provides  the  desired  feature  where  pi  denotes  an  unsupervised  multivariate  conditional  PDF 
of  vector- valued  image  intensity  I  on  the  condition  that  I (x,y)  is  an  element  of  given  by 

Pi  =  p(I(x,y)\(x,y)  e  Qi),  Mi.  (4.10) 

If  pi  is  fixed  as  a  Gaussian  distribution,  equation  4.9  is  equivalent  to  equation  4.8.  Note  that 
the  proposed  objective  function  may  have  multiple  local  maxima,  while  the  objective  functions 
in  equation  4.7  and  4.8  have  only  a  global  minimum.  The  corresponding  global  energy  function 
is  given  by 

E({pi},C)  =  W  [  e(x,y\pi)dxdy  +  v\C\.  (4.11) 

i  M 

The  minimum  of  E  is  achieved  by  a  similar  way  to  minimize  equation  4.6.  First,  instead  of 
minimizing  E  with  respect  to  each  /^,  we  estimate  the  conditional  probability  pi  from  the  given 
image,  which  is  equivalent  to  finding  the  best  fitting  fi .  The  estimated  pi  provides  a  force 
moving  the  active  contour  C  to  the  proper  position  where  C  divides  the  entire  domain  D  of  the 
image  I (x,y)  into  multiple  subsets  {f^},  i.e.  satisfying  the  segmentation  constraints.  Then,  the 
minimization  of  E  with  respect  to  C,  while  {pi}  are  fixed,  smoothes  the  variational  boundaries 
C  minimizing  \C\. 

If  some  level  of  prior  knowledge,  i.e.  training  samples,  of  an  object  is  available,  we  can 
estimate  the  conditional  PDF  p(I\Qi)  directly  from  the  training  samples,  such  that 

P(iia)«p(iia)=p(i|2i),  (4.i2) 
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where  Ii  —  {Ii, . . . ,  In, . . . ,  Iw}  denotes  the  ideally  sampled  training  data  or  a  statistical  tem¬ 
plate  of  the  object,  which  is  supposed  to  be  isolated  as  a  subset  in  the  result  of  the  seg¬ 
mentation.  p(Y\Ii)  :  — ►  3?  denotes  a  multivariate  conditional  PDF  of  a  vector-valued  image 

intensity  I  on  the  condition  that  the  multispectral  image  pixel  is  an  element  of  training  samples 
Ii.  An  objective  function  similar  to  equation  4.9  can  be  used  with  a  supervised  conditional 
PDF  given  by 

ei(x,y)  = -log p(I(x,y)\I(x,y)  eli),  V(a;,y)eft,  Vi,  (4.13) 

where  ei(x,y )  measures  the  similarity  between  the  statistics  of  I  and  2).  Since  the  best  fitting 
PDF  is  given  from  training  samples  instead  of  the  given  image,  the  minimization  of  E  with 
respect  to  pi  is  not  necessary  in  supervised  methods,  and  the  global  energy  function  shown  in 
equation  4.11  can  be  simplified  as  a  function  of  C  only,  such  as 

E{C)  =  V  f  ei(x,  y)dxdy  +  u\C\.  (4.14) 

*  M 

This  will  simplify  the  segmentation  procedure,  consequently  reducing  the  convergence  time  of 
active  contours.  Since  the  best  fitting  p(I|f^)  is  given,  the  segmentation  result  will  be  also  more 
robust  than  the  result  of  unsupervised  methods. 

Various  expressions  can  be  used  for  p(I|f^)  as  long  as  they  satisfy  the  unit  volume  condition: 

[  p(I\fli)  =  1,  or  Y>(I|a)  =  Vi.  (4.15) 

Ji  ! 

For  example,  either  a  parametric  continuous  PDF,  e.g.  a  Gaussian  density  function  p(I|/xi5  S^), 
or  a  non-parametric  discrete  PDF  can  be  used  as  long  as  they  are  integrated  as  one.  The  details 
of  proposed  model  for  p(I|f^)  will  be  discussed  in  chapter  5,  and  the  estimation  methods  will 
be  discussed  in  chapter  6. 
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Chapter  5 

Probability  Density  Model 


The  objective  of  region-based  segmentation  is  to  partition  the  entire  set  of  an  image  into 
multiple  disjoint  subsets  {f^}  based  on  the  uniformity  of  a  desired  feature  within  each  Un¬ 
fortunately,  image  intensity  within  a  subset  is  not  always  uniform  in  practical  applications. 
Subsets  often  form  a  mixture  of  multiple  sub-classes,  e.g.  the  zebra  shown  in  figure  1.3(b)  or 
the  camouflage  pattern  on  the  toy  tank  shown  in  figure  1.4(b).  In  this  chapter,  we  propose  a 
multivariate  mixture  density  function  as  the  statistical  model  of  vector- valued  image  intensity 
used  for  image  segmentation.  Section  5.1  discusses  the  need  and  use  of  a  mixture  density  func¬ 
tion  as  the  statistical  model  of  scalar  image  intensity.  Section  5.2  extends  the  use  of  mixture 
density  function  to  the  case  of  multispectral  images  comparing  the  multivariate  mixture  density 
function  and  the  product  of  marginal  density  functions  which  has  been  commonly  used  in  other 
segmentation  methods. 


5.1  Mixture  Density  Function 

The  assumption  of  segmentation  models  introduced  in  section  4.2  is  that  the  statistical  infor¬ 
mation  of  image  intensity  within  each  subset  is  representable  by  a  simple  expression  /^,  such 
as  a  vector-valued  constant  fi{  as  shown  in  equation  4.7  or  a  small  number  of  parameters 
Si}  as  shown  in  equation  4.8.  However,  the  statistical  distribution  of  image  intensity  of  a 
subset  is  often  unrepresentable  in  a  simple  form.  Figure  5.1  shows  an  example  of  an  image  with 
non-uniform  subsets  and  its  hand-segmented  result.  Figure  5.1(a)  shows  a  result  of  diffusion 
applied  on  figure  1.3(b).  The  background  is  quite  uniform,  and  the  statistical  distribution  (his¬ 
togram)  of  image  intensity  within  the  background  forms  a  unimodal  density  function  as  shown 
in  figure  5.2(b).  The  actual  histogram  of  the  background  presented  as  the  solid  line  can  be  rea¬ 
sonably  represented  with  a  Gaussian  density  function  presented  as  the  dotted  line.  However, 
the  zebra  consists  of  black  and  white  stripes,  and  the  statistical  distribution  of  image  intensity 
within  the  zebra  is  not  unimodal  as  shown  in  figure  5.2(a).  The  higher  peaks  located  about 
I  &  30  and  I  «  50  in  figure  5.2(a)  present  the  black  stripes  of  zebra,  and  the  relatively  lower 
peaks  located  about  /  ~  100  and  /  ~  150  present  the  white  stripes  of  zebra.  Thus,  a  Gaussian 
density  function  cannot  simply  represent  the  actual  statistical  distribution  (histogram)  of  image 
intensity  within  the  zebra  though  it  can  represent  the  background. 

We  propose  to  use  a  mixture  of  probability  density  functions  as  the  feature  represent¬ 
ing  the  statistical  property  of  image  intensity  /  within  a  subset  such  that  the  conditional 
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(a)  (b) 


Figure  5.1:  A  multimodal  image  and  its  ground  truth:  (a)  a  zebra,  (b)  the  hand-segmented 
image 


Zebra  Background 


(a)  (b) 

Figure  5.2:  A  multimodal  distribution  of  image  intensity  and  its  representation  using  a  unimodal 
Gaussian  distribution:  (a)  zebra  and  (b)  background  of  figure  5.1(a) 

PDF  of  /  on  the  condition  I(x,y)  E  Qi  is  given  by 

K 

p(I\£li)  =pj(I)  =  y^akPi(I\k),  Vi,  (5.1) 

k= l 

where  K  denotes  the  number  of  sub-classes  of  the  mixture,  and  pi(I\k)  :  5ft  — >  5ft  denotes  the 
conditional  PDF  of  /  on  the  condition  that  I(x,y)  is  an  element  of  and  I(x,y)  is  generated 
by  the  sub-class  k.  The  weights  {a^}  of  each  sub-class,  often  called  mixing  probabilities,  satisfy 

K 

2>*  =  1-  (5-2) 

k= 1 

Using  a  mixture  density  function,  the  histogram  shown  in  figure  5.2(a)  can  be  represented  as  a 
mixture  of  at  least  two  simple  parametric  density  functions. 


5.2  Multivariate  Mixture  Density  Function 

We  proposed  to  use  a  mixture  density  function  for  scalar  images  in  section  5.1.  How  about 
multispectral  images?  The  mixture  density  function  of  scalar  image  intensity  /  shown  in  equa- 
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tion  5.1  is  a  complicated  statistical  model,  and  a  mixture  density  function  of  vector- valued  image 
intensity  I  is  an  even  more  complicated  statistical  model.  For  example,  the  histogram  of  24bit 
RGB  image  intensity  requires  224  =  16777216  histogram  bins,  while  the  histogram  of  8bit  gray 
image  requires  256  histogram  bins.  Because  of  the  complexity  and  the  high  computational  cost, 
the  active  contour  models  employing  mixture  density  functions  [137,  124,  138,  139]  as  well  as 
traditional  region-based  active  contour  models  [140,  5]  have  used  alternative  expressions  rather 
than  dealing  with  multi-dimensional  image  intensity  space.  Based  on  the  assumption  that  each 
band  is  independent,  a  common  alternative  expression  of  p(I)  is  the  product  of  marginal  density 
functions  measured  at  each  dimension,  i.e.  spectral  bands,  such  as 

B 

g(I)  =  Hp(Ib).  (5.3) 

6=1 

If  p( I)  is  a  mixture  density  function,  the  equivalent  expression  can  be  expressed  as 

B  K 

,(d=iie^(w-  t5-4) 

6=1  k= 1 

Using  this  expression,  the  proposed  objective  functions  shown  in  equation  4.9  or  4.13  are  given 
by 


B  K 

ei(x,y)  =  -log  iie  akPi(Ib(x,y)\k),  M{x,y)  G  ft,  Vi, 

6  k= 1 
B  K 

=  akpi(Ih(x,y)\k).  (5.5) 

6  k= 1 

The  computation  of  g( I)  is  relatively  easier  than  p(I)  because  the  dimensionality  of  p(Ib)  is 
as  low  as  5ft  — ►  5ft  even  though  g(l)  and  p( I)  have  the  same  dimensionality  as  5ftB  — ►  5ft.  The 
expressions  similar  to  g( I)  and  logg(I)  have  been  used  in  a  few  other  active  contour  models 
designed  for  multispectral  images  [5,  140],  and  we  had  previously  proposed  a  similar  expres¬ 
sion  [139]  using  the  product  of  normalized  histograms  measured  at  each  band.  This  is  not  a  bad 
idea  because  the  statistical  meaning  of  p(T)  and  g( I)  are  equivalent  as  long  as  p(I)  is  a  unimodal 
distribution  as  shown  in  figure  5.3.  Figure  5.3(a)  presents  a  PDF  of  a  two-dimensional  random 
variable  I  G  5ft2,  which  is  distributed  by  a  two-dimensional  Gaussian  density  function  p(I|/x,  E). 
In  figure  5.3(b),  the  shades  on  the  walls  present  the  marginal  density  functions  of  I,  respectively 
given  by  p(I\)  and  p{1 2).  The  three-dimensional  curve  presents  the  product  of  p(Ib)  given  by 
g( I)  =  p(Ii)p(l2)-  As  the  two  expressions  p(I)  and  g(T)  are  equivalent,  we  can  alternatively  use 
g( I)  for  p(I)  in  this  case. 

Unfortunately,  g( I)  for  p(I)  are  not  equivalent  any  more  if  p(I)  is  a  mixture  density  function 
as  shown  in  figure  5.4.  Figure  5.4(a)  presents  a  PDF  of  I  G  5ft2,  distributed  by  a  mixture  of 
two  different  two-dimensional  Gaussian  functions  p( I)  =  (ap(I|/xl5  Ei)  +  (1  —  <a)p(I|/z2,  E2).  The 
shades  on  the  walls  present  p(Ii)  —  o^Pi(Ii)  +  (1  —  cx)p2{I\)  and  pfa)  —  apiih)  +  (1  —  a)p2(l2)i 
and  the  three-dimensional  curve  presents  g (I)  =  p(Ii)p(l2)  =  {api(Ii)  +  (l— ct)p2(Ii)}{&Pi(l2)  + 
(1  —  a)p2(l2)}-  As  we  can  see  in  figure  5.4(a)  and  5.4(b),  the  product  of  marginal  density 
functions  measured  at  each  dimension,  i.e.  g( I),  is  not  equivalent  to  the  true  PDF  p( I)  any 
more.  Although  there  are  only  two  modes  in  the  true  PDF  p( I),  there  are  four  modes  in  g( I). 
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(a)  (b) 


Figure  5.3:  A  unimodal  distribution  in  a  two-dimensional  image  intensity  space  and  its  recon¬ 
struction:  (a)  p( I),  (b)  g( I)  =  p(Ii)p(h) 


(a)  (b) 

Figure  5.4:  A  multimodal  distribution  in  a  two-dimensional  image  intensity  space  and  its  re¬ 
construction:  (a)  p( I)  =  api(l)  +  (1  -  a)p2(I)j  (b)  5(1)  =  {api(Ii)  +  (1  -  a)p2(h)}{api(l2)  + 
(1  -  a)p2{h)} 

The  two  excessive  modes  likely  result  in  classifying  the  out-class  image  pixels  as  the  given  class, 
equivalent  to  the  false  alarm  [141]  in  pattern  classification.  Since  f  g(I)  is  fixed  as  1,  the  two 
excessive  modes  result  in  reducing  the  power  of  the  two  correct  modes  in  g( I).  This  incident 
likely  results  in  excluding  the  in-class  image  pixels  from  the  given  class,  equivalent  to  the  mis- 
classification  [141]  in  pattern  classification.  Therefore,  g( I),  i.e.  the  alternative  expression  of 
p( I),  is  not  a  proper  choice  for  the  multispectral  images  with  non-uniform  subsets  though  it  is  a 
reasonable  choice  for  the  multispectral  images  with  uniform  subsets.  Therefore,  we  propose  to 
use  a  mixture  of  multivariate  probability  density  functions  as  the  statistical  information 
of  vector- valued  image  intensity  within  a  subset,  such  that 

PiO)  =  p(m) 

K 

=  ^2akpi(l\k),  Vi,  (5.6) 

k= 1 

where  pi(I\k)  denotes  the  conditional  PDF  of  I  on  the  condition  that  I (x,y)  G  and  I  is 
generated  by  a  sub-class  k.  The  corresponding  objective  function  equivalent  to  equation  4.9 
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using  the  proposed  density  model  is  given  by 

K 

e(x,y\pi)  = -log^2akp{l(x,y)\(x,y)  eQi,k),  V(x,y)eft,  Vi,  (5.7) 

k=  1 

and  the  corresponding  objective  function  equivalent  to  equation  4.13  using  the  proposed  density 
model  is  given  by 

K 

ei(x,y)  = -log^2akp(I(x,y)\I(x,y)  eli,k),  V(x,y)  < Eft,  Vi.  (5.8) 

fe=i 
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Chapter  6 

Density  Estimation  Methods 


We  have  proposed  to  use  a  multivariate  mixture  density  function  as  the  conditional  PDF  of 
vector- valued  image  intensity  Pi(T).  In  this  chapter,  we  discuss  the  estimation  methods  of  mul¬ 
tivariate  mixture  density  functions  as  a  topic  of  statistical  pattern  recognition.  Estimating  a 
mixture  density  function,  often  called  mixture  fitting ,  has  been  one  of  the  most  difficult  tasks  in 
statistical  pattern  recognition.  A  multi-dimensional  feature  space  makes  the  estimation  prob¬ 
lem  even  more  difficult. 

Let  N  data  samples  1  be  given  as  I  =  (1(1),  1(2), . . . ,  I (n), . . . ,  I(7V)},  where  I  G  ?R:B  denotes 
a  multi-dimensional  random  variable.  The  PDF  of  a  random  variable  can  be  estimated  from 
the  data  samples  1  if  the  statistical  distribution  of  the  data  samples  approximates  the  true 
PDF  p( I),  such  as 

p(I)«p(T).  (6.1) 

The  base  assumption  of  this  approximation  is  that  each  data  sample  I (n)  is  generated  indepen¬ 
dently  and  ideally  from  the  true  PDF  p( I). 

There  are  two  approaches  to  estimate  p( I)  based  on  the  ways  to  represent  a  mixture  density 
function:  parametric  and  non-parametric.  Parametric  approaches  consider  p(l)  as  a  mixture 
of  multiple  continuous  parametric  density  functions,  while  non-parametric  approaches  consider 
p(I)  as  a  discrete  function.  Section  6.1  discusses  parametric  density  estimation  methods,  and 
section  6.2  discusses  non-parametric  density  estimation. 


6.1  Parametric  Density  Estimation  Methods 


In  the  parametric  approaches,  a  collection  of  parameters 


{«!,  .  .  .  ,  Oik,  .  .  .  ,  OiK }, 
(0i,  ...,0k,---,  Ok} 


(6.2) 


represent  a  mixture  density  function  p(I),  where  each  parameter  set  Ok  specifies  the  conditional 
PDF  of  a  vector- valued  image  intensity  I  on  the  condition  that  I  is  generated  by  sub-class  k,  and 
Ok  specifies  the  relative  weight  of  the  corresponding  sub-class.  The  mixture  density  function 


35 


p( I)  can  be  approximated  by  estimating  a  parameter  set  ©  from  data  samples  Z,  such  that 

K 

p(I)  =p(I|©)  «p(X|0)  =  (6-3) 

k= 1 

where  each  6 &  is  estimated  based  on  a  particular  stochastic  model  such  as  a  Gaussian  distribu¬ 
tion  p(I|/x,  E).  As  a  parametric  density  function,  we  propose  to  use  a  mixture  of  multivariate 
Gaussian  distributions  for  p(I|0),  such  that  the  estimated  PDF  of  a  sub-class  is  given  by 

p{l\ek)  =  p(X\fik,tk).  (6.4) 

The  stochastic  model  above  is  an  arbitrary  random  distribution  whose  only  prior  knowledge 
is  that  the  statistics  of  each  sub-class  k  is  representable  as  a  weighted  multivariate  Gaussian 
distribution  akp(I\/xk,  E&).  There  are  four  sets  of  parameters  to  be  estimated:  the  number  of 
sub-classes  A,  the  weight  of  each  sub-class  {<a&},  the  mean  vector  of  each  sub-class  and 

the  covariance  matrix  of  each  sub-class  {E&}. 

Based  on  the  assumption  that  each  data  sample  I (n)  is  generated  independently,  the  likeli¬ 
hood  measured  from  the  data  samples  1  is  given  by 

N 

p(l\&)  =  J]p(I(n)l©),  (6-5) 

71=1 

and  the  log-likelihood  of  a  K-component  mixture  model  is  given  from  equation  6.3  and  6.5  as 

N  N  K 

logp(X|0)  =  log  ]]p(I(ri)|0)  =  ^2\og^2akp(I(n)\Gk)  .  (6.6) 

77=1  77=1  k= 1 

It  is  well  known  that  the  maximum  likelihood  (ML)  estimate  [141,  143] 

®ml  =  arg  nmxjlog  p(I\®)}  (6.7) 

or  the  Bayesian  maximum  a  posteriori  (MAP) 

®  map  =  argmax{logp(X|0)  +  logp(0)}  (6.8) 

cannot  be  found  analytically  for  this  mixture  density  estimation  problem  [144,  145].  Therefore, 
an  iterative  learning  algorithm,  e.g.  the  EM  algorithm,  has  been  recommended  [141]  as  an  at¬ 
tractive  solution. 


6.1.1  EM  Algorithm 

The  expectation-maximization  (EM)  methods  [146,  147,  141,  148,  149]  have  been  popularly 
used  to  obtain  the  ML  or  MAP  estimates  of  the  mixture  parameters.  The  EM  algorithm  is  an 
iterative  procedure  which  finds  local  maxima  of  logp(X|@)  or  logp(X|0)  +logp(0)  interpreting 
the  observed  samples  1  as  incomplete  data.  For  the  mixture  density  model,  the  missing  part  is 
given  as  a  set  of  labels 

z  =  M1),  z(2),  •  •  • ,  z(n), . . . ,  z(iV)}  ,  (6.9) 
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where  each  label  is  a  multi-dimensional  binary  vector 


z(n)  =  [z1(n),Z2(n),...,zk(n),...,zK(n)]T  €  {0, 1}K  (6.10) 

indicating  which  sub-class  produced  the  observed  sample  I (n).  That  is,  zj^{n)  =  1  if  and  only  if 
I  (n)  is  generated  by  the  sub-class  k.  Otherwise,  Zk(n)  —  0.  The  complete  log-likelihood,  i.e.  the 
one  from  which  we  could  estimate  ©  if  the  complete  data  X  —  {X,  Z}  was  observed  [149],  is 
given  by 

N  K 

logger,  z|©)  =  EE  Zk(n)\og[akp(I(n)\ek)]  .  (6.11) 

n= 1 k= 1 

A  time- varied  estimate  ©(£)  is  produced  by  iteratively  applying  two  sequential  steps,  i.e.  ex¬ 
pectation  and  maximization,  until  the  given  convergence  criterion  is  met. 


In  the  expectation  stage,  often  called  E-step ,  the  conditional  expectation  of  the  complete 
log-likelihood  is  computed  given  1  and  the  current  estimate  ©(£).  Since  the  elements  of  Z  are 
binary,  their  conditional  expectation  is  given  by 


W  =  E[z\i,@(t)\ 

where  each  element  is  given  by 

wk(n)  =  E[zk(n)\l,@(t)\ 

=  Pr[zk(n)  =  l|I(n),  ©(*)] 

with  the  unit  volume  condition 

K 

y^wfc(n)  =  1  . 

k=  1 

Equation  6.13  may  be  interpreted  as  an  instance  of  Bayes  law,  i.e.  &k(t)  and  Wk(n)  are  re¬ 
spectively  equivalent  to  the  a  priori  probability  and  the  a  posteriori  probability  of  Zk(n)  =  1 
after  observing  I (n).  Since  logp(Z, Z\@)  is  linear  with  respect  to  the  missing  X,  the  condi¬ 
tional  expectation  of  the  complete  log-likelihood  is  given  by  substituting  W  =  E[Z\Z,®(t)] 
into  logp(X,  Z\®),  such  as 


(6.12) 

(6.13) 

(6.14) 


Q(0,0(t)) 


E 


iogP(i,z\e)\i,e(t) 


logp(X,W|©) 

N  K 

EE  wk(n)\og[akp(l(n)\Ok)\  . 

n= 1 k= 1 


(6.15) 


The  estimated  parameter  set  ©  is  the  one  maximizing  the  likelihood  function  above. 


In  the  maximization  stage,  often  called  M-step ,  the  estimated  parameters  are  updated  ac¬ 
cording  to 

@{t  +  1)  =  argmax{Q(©,  ©(£))  +  logp(©)}  ,  (6.16) 
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in  the  case  of  MAP  estimation,  or 


0(i  +  1)  =  argmax{Q(0,  0(t))}  , 


(6.17) 


for  the  ML  criterion. 

Despite  its  popular  use,  the  EM  algorithm  for  mixture  density  estimation  has  several  draw¬ 
backs:  First,  the  number  of  sub-classes  K  should  be  provided  as  a  prior  knowledge.  This 
requirement  limits  the  use  of  EM  algorithm  for  unsupervised  learning.  Second,  K  is  fixed 
during  the  estimation  process,  so  the  EM  algorithm  may  converge  to  the  boundary  of  the 
parameter  space.  For  example,  the  weight  a &  may  become  zero  and  consequently  the  corre¬ 
sponding  parameters  specifying  the  sub-class  k  become  singular  if  K  is  significantly  higher  than 
the  unknown  true  number  of  sub-classes.  Third,  it  is  a  local  (greedy)  method,  thus  sensitive 
to  initial  values  of  parameters  6 k(t  =  0).  If  —  0)  is  too  far  way  from  the  true  parameter 

the  EM  algorithm  may  not  converge  to  the  true  or  optimal  parameters  because  the  EM 
algorithm  cannot  move  sub-classes  across  low  likelihood  regions1  [144,  150].  Common  solutions 
for  these  problems  are:  using  multiple  random  starts  and  choosing  the  final  estimates  with  the 
highest  likelihood  [151,  148,  149,  152]  or  initialization  by  clustering  [151,  148,  149].  Figure  6.1 
shows  the  results  of  the  standard  EM  algorithm  using  different  K.  1000  random  samples  I  were 


(a)  (b) 

Figure  6.1:  The  performance  of  EM  algorithm  according  to  the  number  of  sub-classes  assumed: 
(a)  K  =  4,  (b)  K  =  10 

generated  from  a  mixture  of  4  Gaussian  density  functions.  The  (black)  solid  curves  present  the 
true  PDF  p( I)  which  generated  the  observed  data  samples.  The  (red)  solid  curves  present  the 
estimated  density  function  p(l |0),  and  the  (red)  dotted  curves  present  the  weighted  density 
functions  of  sub-classes,  given  by  akp(Z\0k)-  A  successful  estimation  is  determined  by  how  close 
the  p{I |©)  is  estimated  to  the  p{ I).  In  figure  6.1(a),  the  EM  algorithm  cannot  find  any  of  the 
true  sub-classes  with  K  —  4  because  the  initializations  of  the  parameters  {#&(£  =  0)}  are  too 
far  away  from  the  true  position  and  too  close  from  the  neighbor  sub-classes,  while  it  finds  all 
four  true  sub-classes  with  K  —  10  as  shown  in  figure  6.1(b). 


1Note  the  term  region ,  used  here,  exists  in  a  parameter  space.  The  region ,  used  in  segmentation,  exists  on  a 
spatial  domain. 
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6.1.2  EM  Algorithm  with  MML 

The  estimation  of  the  number  of  sub-classes  K  [153]  has  been  an  important  issue  in  mixture 
density  estimation.  If  K  is  too  high,  the  density  estimation  may  fail  to  converge  to  the  true 
density  due  to  singular  parameters,  while  the  estimated  density  function  cannot  approximate 
the  true  underlying  mixture  density  function  if  K  is  too  low.  Figueiredo  and  Jain  proposed  an 
advanced  EM  method  [144]  based  on  the  minimum  encoding  length  criteria,  which  automati¬ 
cally  selects  the  number  of  sub-classes  K  by  annihilating  the  weak  candidates  of  sub-classes. 


The  minimum  encoding  length  criteria  are  popular  concepts  in  the  encoding  and  error 
checking  problem  in  the  communication  engineering  area,  and  they  are  also  widely  applied  in 
pattern  classification  area.  According  to  the  minimum  encoding  length  criteria,  e.g.  Minimum 
Message  Length  (MML)  [154,  155]  and  Minimum  Description  Length  (MDL)  [156],  a  good  data 
generation  model,  i.e.  a  parameter  set  0,  is  a  representation  which  minimizes  the  length  of 
code  required  to  describe  the  data  samples  1  [144,  157].  Let  the  data  samples  X,  known  to  have 
been  generated  according  to  p(I |0),  be  encoded  and  transmitted.  If  p(I |0)  is  fully  known  to 
both  the  transmitter  and  the  receiver,  they  can  both  build  the  same  code,  and  communication 
can  proceed.  However,  if  the  parametric  model  ©  is  unknown,  the  transmitter  has  to  start  by 
estimating  and  transmitting  ©.  This  leads  to  a  two-part  message,  whose  total  length  is  given 

by 

£(©,T)  =  £(©)  +  £(Z|©)  .  (6.18) 

The  estimated  parameter  set  ©  is  the  one  minimizing  this  length  function. 


According  to  the  standard  two-part  code  formulation  of  MDL  and  MML,  the  expected 
number  of  data  samples  generated  by  sub-class  k  is  iVa^,  where  N  denotes  the  number  of  data 
samples.  Thus,  the  optimal,  in  the  MDL  sense,  code  length  of  each  6 &  is  (M/2)  log(7V<a&), 
where  M  denotes  the  number  of  parameters  specifying  6 As  the  zero- weighted  sub-class  is 
considered  to  be  removed,  we  need  to  code  only  non-zero  weighted  ^  0  sub-classes.  The 
optimal  parameter  set  ©  is  estimated  by  minimizing  the  cost  function 


k:a^>  0 


,  tfn*,  N  t  Knz(M  + 1) 

+  ^l0gl2  + - 2 - 


-  logp(I|0)  , 


(6.19) 


with  respect  to  ©  given  by 

©  =  argmin£(©,X)  ,  (6.20) 

where  Knz  denotes  the  number  of  sub-classes  with  non-zero  weight  7^  0.  From  the  Bayesian 
point  of  view  [158],  the  cost  function  shown  above  is  equivalent,  for  fixed  Knz ,  to  an  a  posteriori 
density  resulting  from  the  adoption  of  Dirichlet-type  prior  for  the  weights  {<a&},  given  by  [144] 


p(aq, . . . ,  olk)  oc  exp 
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E  los ak 

k= 1 


(6.21) 


and  a  flat  prior  leading  to  ML  estimates  for  the  parameters  { 6 specifying  each  sub-class  k. 
The  EM  algorithm  to  minimize  the  cost  function  in  equation  6.19,  with  fixed  Knz ,  has  the 
following  M-step: 
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(6.23) 


0fc(£  +  1)  =  argmaxQ(©,  ©(£))  ,  {fc  :  +  1)  >  0}  , 

&k 

where  the  conditional  expectation  Wk(n)  are  given  by  the  E-step  in  equation  6.13.  The  M-step 
defined  by  equation  6.22  and  equation  6.23  do  the  actual  component  annihilation  by  setting  the 
weight  OLk  as  zero. 

The  parameter  set  0 &  specifying  the  sub-classes  for  which  &k(t  +  1)  =0  become  irrelevant 
because  any  sub-class  for  which  —  0  does  not  contribute  to  the  log-likelihood  in  equation  6.6, 
thus  the  algorithm  explicitly  removes  the  weakest  sub-class,  decreasing  Knz.  This  prevents  the 
estimation  algorithm  from  approaching  the  boundary  of  the  parameter  space,  one  of  the  draw¬ 
backs  of  the  standard  EM  algorithm  for  mixture  density  model.  This  algorithm  also  solves 
the  initialization  problem  by  starting  with  very  high  number  of  initial  sub-classes  Kina  and 
removing  weak  sub-classes  step  by  step.  The  initial  parameters  {Ok(t  =  0)}  can  be  located 
anywhere  along  the  whole  parameter  space.  Although  this  algorithm  still  does  not  guarantee 
the  global  convergence,  it  does  provide  a  flexible  way  to  deal  with. 

The  direct  use  of  the  standard  EM  algorithm  with  M-step  in  equation  6.22  and  6.23  has 
a  failure  mode,  i.e.  if  the  number  of  initial  sub-classes  Kina  is  too  large,  no  sub-class  has 
enough  initial  support  J2n= iwk(n)  >  M/2  ,  Mk  and  consequently  will  be  undetermined. 
Component-wise  EM  (CEM)  algorithm  [160]  avoids  this  problem  by  updating  each  element  of 
parameter  set  ©  =  {{a^},  {#&}}  sequentially  instead  of  simultaneously.  That  is,  update  aq, 
01,  recompute  W,  and  update  OL2  and  02,  recompute  W  again,  and  so  on.  If  one  sub-class 
is  annihilated  &k(t  +  1)  =  0,  its  probability  mass  is  immediately  redistributed  to  the  other 
sub-classes  with  non-zero  weight  &k(t  +  1)  ^  0.  This  consequently  increases  their  chance  of 
survival,  and  allows  initialization  with  an  arbitrarily  large 

Figure  6.2  shows  the  performance  of  this  method  applied  to  the  same  data  samples  used 
for  the  standard  EM  method  in  figure  6.1.  1000  random  samples  were  generated  by  a  mixture 


Figure  6.2:  The  advanced  EM  method  proposed  by  Figueiredo  and  Jain  applied  to  the  same 
data  used  in  figure  6.1.  The  estimation  starts  with  Kina  =  32  and  converges  to  K  =  4. 

density  function  p(I),  presented  as  the  (black)  solid  curves.  Randomly  chosen  32  sub-classes  are 
initialized  Kinu  =  32,  and  then  the  weakest  sub-class  is  removed  if  a  criterion  is  satisfied.  After 
the  iterative  EM  processing,  it  eventually  converges  to  the  four  sub-classes  K  —  4,  presented 
as  the  (red)  dotted  curves.  The  estimated  mixture  density  function  p(l |©),  presented  as  the 
(red)  solid  curves,  successfully  estimates  the  true  mixture  density  function  p(T). 
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6.2  Non-parametric  Density  Estimation 

In  non-parametric  approaches,  a  discrete  function  instead  of  a  mixture  of  parametric  density 
functions  approximates  the  true  PDF  of  data  samples.  Given  independently  and  ideally  gener¬ 
ated  data  samples  X,  the  assumption  of  non-parametric  approach  is  that  we  can  approximate 
the  statistical  distribution  of  a  random  variable  I  directly  from  the  statistics  of  X  without  any 
particular  stochastic  model,  such  that 

p(I)  *  p(I)  =  h(l),  (6.24) 

where  h(-)  denotes  a  non-parametric  density  function. 

If  N  observed  data  samples  X  —  (1(1), 1(2), . . . , I(7V)}  are  generated  from  the  true  PDF 
p( I),  parametric  learning  algorithms  estimate  the  stochastic  model  of  random  variable  I  based 
on  two  assumptions:  the  statistics  of  I  follow  a  particular  stochastic  model,  e.g.  a  Gaussian  dis¬ 
tribution  p( I|/z,  E);  and  the  stochastic  model  can  represent  p(X).  If  the  data  samples  X  do  not 
satisfy  these  assumptions,  the  estimated  parametric  expressions  p(I|/h,  E)  cannot  approximate 
p( I).  Since  non-parametric  approaches  do  not  rely  on  any  particular  stochastic  model,  they  can 
estimate  the  true  underlying  PDF  of  data  samples  unless  the  data  samples  I  were  observed  in 
a  biased  manner.  This  is  an  attractive  feature  for  a  mixture  density  estimation  because  the 
true  p( I)  is  often  difficult  to  express  with  a  particular  stochastic  model. 


6.2.1  Histogram 


There  are  many  non-parametric  density  estimation  methods,  e.g.  Parzen  windows,  fc-nearest 
neighbor,  and  histogram.  Histogram,  often  called  frequency  histogram ,  is  the  most  elemental 
method  among  them.  Given  N  observed  data  samples  X,  the  histogram  divides  the  dynamic 
range  of  image  intensity  I  into  a  given  number  of  frins,  and  counts  the  number  of  data  samples 
corresponding  to  each  bin  [161] .  Because  of  this  simple  procedure,  we  use  a  multi-dimensional 
histogram  density  function2  to  estimate  the  discrete  non-parametric  PDF  of  image  intensity, 
such  that 


1  hist[X] 
N  AI 


(6.25) 


where  hist[X]  denotes  a  multi-dimensional  histogram  of  X,  AI  denotes  the  volume  of  a  histogram 
bin,  and  N  denotes  the  number  of  data  samples  X.  h(X)  should  satisfy  the  unit  volume  condition 
given  by 

X>(Z)  =  1.  (6.26) 

/ 


In  the  case  of  scalar  images,  AI  is  equivalent  to  the  width  of  a  histogram  bin,  and  the  range  of 
each  AI  do  not  overlap,  which  is  different  from  other  non-parametric  methods.  The  histogram 
density  function  h(X)  approximates  p( I)  in  a  discrete  manner.  Figure  6.3  shows  an  example  of 
the  histogram  density  function  h(X )  of  the  same  data  samples  used  in  the  previous  graphs.  The 
width  of  each  bin  AI  is  set  to  1  in  figure  6.3(a),  and  set  to  3  in  figure  6.3(b).  With  the  same 
number  of  data  samples,  h(I )  tends  to  be  smoother  losing  its  precision  as  AI  increases,  while 


2Although  h(T)  shown  in  equation  6.25  is  occasionally  referred  to  histogram  or  frequency  histogram  in  other 
publications  [161],  it  is  defined  as  histogram  density  function  in  this  document.  Histogram  and  frequency  his¬ 
togram  are  referred  to  the  frequency  number  for  each  bin,  i.e.  the  histogram  in  common  sense. 
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Figure  6.3:  An  example  of  the  histogram  density  function  h(I):  (a)  AI  =  1,  (b)  AI  =  3 


h(I)  tends  to  be  more  sparky  as  AI  decreases.  If  AI  is  infinitely  narrow  and  N  is  infinitely 
high  at  the  same  time,  the  discrete  h(T)  can  ideally  approximate  the  true  PDF  p( I),  such  that 

p(I)  «  lim  Jim  h(T).  (6.27) 

AI^O  N^OO 

The  process  of  computing  a  histogram  density  function  h{I)  is  simple  and  easy  to  implement. 
Also,  h(I)  does  not  require  any  iterative  learning  process,  and  consequently  requires  shorter 
computation  time.  Since  active  contours  have  an  iterative  process  demanding  long  computation 
time,  h(T)  is  a  quite  attractive  density  estimation  method  for  active  contours.  However,  the 
accumulation  procedure  requiems  a  huge  amount  of  memory  depending  on  the  dimensionality 
of  I.  For  example,  a  histogram  of  a  24bit  RGB  image  intensity  consists  of  224  =  16777216  bins, 
128Mbytes  if  8bytes/bin. 

The  frequency  of  each  bin  of  a  histogram  is  independent  ignoring  the  neighborhood  in  the 
image  intensity  space.  That  is,  the  frequency  of  the  bin  a  is  independent  from  the  frequency 
of  the  bin  a  +  1.  This  localized  frequency  is  a  good  feature  for  mixture  density  model  because 
weak  sub-classes  can  survive  without  being  dominated  by  a  strong  neighbor  sub-class  as  shown 
in  figure  6.1(a).  However,  it  also  overfits  the  distribution  of  the  histogram  density  function 
h(T)  to  the  data  samples  X,  and  reduces  the  flexibility  of  the  estimated  PDF  p( I).  That  is,  the 
p( I)  from  a  set  of  data  samples  I\  may  be  different  from  the  p(I)  from  the  other  set  of  data 
samples  I2  even  if  both  X\  and  X2  are  generated  from  a  statistical  model.  A  partial  solution 
of  this  problem  is  to  provide  a  large  data  set.  This  problem  becomes  even  more  serious  partic¬ 
ularly  if  the  dimensionality  of  I  is  high.  Thus,  a  histogram  density  function  needs  extremely 
higher  number  of  data  samples  as  the  dimension  of  data  grows,  which  is  called  the  curse  of 
dimensionality  [162].  The  high  sensitivity  to  noise  is  another  problem  caused  by  the  localized 
bins.  Kernel  density  estimation  methods,  e.g.  Parzen  window  [141],  provide  more  flexible  and 
smooth  distribution  than  h(X)  at  the  expense  of  computation  time. 
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Chapter  7 

Active  Contour  Implementation 
using  Level  Set 


The  proposed  segmentation  methods  are  implemented  in  a  form  of  active  contours.  We  propose 
to  use  the  region-based  active  contour  model  using  level  set  theory.  The  level  set  implemen¬ 
tation  of  the  proposed  active  contour  model  is  based  on  the  multi-phase  active  contour  model 
proposed  by  Chan  and  Vese  [108,  74,  113].  In  this  chapter,  we  discuss  the  level  set  implemen¬ 
tation  of  the  proposed  active  contour  models. 


7.1  The  Base  Active  Contour  Model 


Multi-phase  active  contours  can  partition  an  image  into  more  than  two  subsets  simultaneously. 
Let  us  redefine  the  entire  domain  of  an  image  I (x,y)  as  a  disjoint  set  of  subsets,  such  that 


(7.1) 


where  the  interior  regions  include  the  contour  pixels  C  E  Then,  each  subset  can  be 

identified  by  a  set  of  binary  identity  functions 

1,  if  (x,  y)  €  Cli 
0,  otherwise 


Xi(x,y)  = 


,  \/(x,  y)  €  f Vi 


(7.2) 


composed  of  a  group  of  regularized  unit  step  functions  Hj  :  — >  {0, 1}  given  by 


Hj  =  He{(j>j(x,y)) 


,  v(x,y)en,  Vj. 


(7.3) 


1,  if  (f>j(x,  y)  >0 
0,  if  4>j{x, y)  <  0 

The  identity  functions  {xi(xiV)  :  ^  {0, 1}}  for  the  case  of  2  subsets  and  4  subsets  are 

respectively  defined  as  [74] 
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(7.4) 


(7.5) 
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J  level  set  functions  {</> i, . . . ,  <fij, . . . ,  cj)j}  can  compose  up  to  2J  subsets  {fli, . . . ,  Qi, . . . ,  fl2j} 
in  this  way.  An  example  of  subsets,  defined  by  multi-phase  level  set  functions,  is  shown  in 
figure  7.1  where  {flo,  ^l,  ^2,  ^3}  denote  the  four  subsets  defined  by  two  level  set  functions 

{</>!,  <M- 


Figure  7.1:  Subsets  and  contours  defined  by  two  level  set  functions,  {0i,02} 


Using  the  identity  functions  {xi(xiV)}i  the  integration  over  each  subset  Qi  in  the  global 
energy  function  E  shown  in  equation  4.11  and  4.14  can  be  transformed  to  the  integration  over 
the  entire  image  plane  O,  such  as 


/  ei(x,  y)dxdy  =  /  ei(x,y)xi(x,y)dxdy,  Vi, 
Jq,  Jn 


(7.6) 


which  makes  the  computation  much  easier.  Also,  the  length  of  contours  \Cj\  is  equivalent  to 
the  integration  of  |ViTj|  over  fl,  such  that 


\Cj\  =  [  iVH^jix.y^dxdy,  Vj, 
Jn 


(7.7) 


where  Cj  denotes  a  set  of  active  contours  formed  by  the  corresponding  level  set  function  c fij(x ,  y) 
as  Cj  =  {(x,y)  :  (j)j(x,y )  =  0}.  The  global  energy  function  of  the  multi-phase  active  contour 
model  and  the  associated  Euler-Lagrange  equation  obtained  by  minimizing  the  energy  function 
E  with  respect  to  4>  =  {(j)  1, . . . ,  </>j, . . . ,  which  are  introduced  in  [74,  113],  can  be  generalized 
with  an  arbitrary  form  of  objective  functions  e^(x,?/),  such  as 


2J  J 

E  =  E  /  ej(x,y)dxdy  +  i/^  \Cj\ 

i= 1  j= 1 

2J  J 

=  E  /  ei(x,y)xi(x,y)dxdy  +  i/W  /  |Vtfj|d.xdy, 
i=i  Jn  3= 1 


(7.8) 


and 


dfpjjx^y) 

dt 


—  $ 3 


-^2ei(x,y) 


2=1 


dHi 


,  Vj, 


(7.9) 


where  <5y  =  Se(<j>j(x,y))  denotes  the  first  derivatives  of  Hj  with  respect  to  <j)j,  and  Kj  = 
K(<f>j(x,y ))  denotes  the  mean  curvature  of  <f>j(x,y). 


44 


7.2  Proposed  Active  Contour  Model 


The  proposed  active  contour  model  is  obtained  by  substituting  the  two  proposed  objective 
functions  e(x,y\pi)  in  equation  4.9  and  ei(x,y)  in  equation  4.13  into  the  generalized  multi¬ 
phase  active  contour  model  shown  in  equation  7.9,  such  that 


dt 


=  5< 


VK, 


+  5T°g  P*(I(®,J/)) 


2=1 


dXi 

dHn 


,  Vj, 


(7.10) 


where pi(I(x,y))  =  p(I(x,y)\(x,y)  G  f^)  for  unsupervised  segmentation  model,  &ndpi(I(x,y))  = 
p(I(x,  y)|I(x,  y)  G  T{)  for  supervised  segmentation  model.  The  conditional  PDF  p^(-)  could  be 
either  a  parametric  expression  or  a  non-parametric  expression.  We  propose  to  use  a  mixture 
of  multivariate  density  functions  as  shown  in  equation  6.3  for  supervised  segmentation  model. 
More  detail  will  be  discussed  in  chapter  9  and  10.  We  also  propose  to  use  a  histogram  density 
function  as  shown  in  equation  6.24  and  6.25  for  unsupervised  segmentation  model.  More  detail 
will  be  discussed  in  chapter  8.  The  actual  level  set  evolution  equations  for  the  case  of  two 
subsets  and  four  subsets  are  respectively  given  by 


94>{x,  y) 

dt 
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(7.11) 


(7.12) 


where  logp^  =  \ogpi(I(x,y)).  Sj  =  5e((/)j(x,y))  is  not  necessarily  a  Dirac  delta  function,  but  it 
performs  a  role  of  window  like  a  bandpass  filter  which  controls  the  width  e  to  update  cj)j(x,y). 
The  level  set  function  (f)j(x,y)  only  where  \<j)(x,y)  \  <  £,  i.e.  around  corresponding  contours  Cj, 
are  updated  and  the  rest  of  area  is  ignored.  The  curvature  term  Kj  keeps  Cj  in  a  smooth  shape. 
During  the  the  contour  evolution,  pi(I(x,y))  for  supervised  segmentation  methods  returns  the 
probability  of  I (x,y)  under  the  condition  that  I (x,y)  is  an  element  of  given  training  samples 
2 i,  while  pi(I(x,y))  for  unsupervised  segmentation  methods  returns  the  probability  of  I (x,y) 
under  the  condition  that  I (x,y)  is  an  element  of  the  subset  These  conditional  probabilities 
provide  a  force  which  changes  the  corresponding  level  set  functions,  i.e.  propagates  the  corre¬ 
sponding  active  contours,  and  consequently  partitions  the  given  image  satisfying  the  desired 
property,  which  is  either  uniformity  of  I  within  or  similarity  between  the  statistics  of  I  and 
2 i.  The  terms  inside  of  {•}  in  above  expressions  provide  segmentation  force,  while  the  curva¬ 
ture  terms  provide  a  regularity  force.  Constant  v  controls  the  balance  between  these  two  forces. 
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Chapter  8 

Unsupervised  Active  Contour  Model 
using  Multi-dimensional  Histograms 


In  this  chapter,  we  propose  an  unsupervised  active  contour  model  using  the  non-parametric 
density  estimation  method  introduced  in  section  6.2.1.  The  proposed  active  contour  model 
measures  a  non-parametric  PDF  expression  of  each  subset  at  each  iteration,  and  updates  level 
set  functions  based  on  the  measured  PDF  expressions.  Section  8.1  discuss  the  proposed  unsu¬ 
pervised  image  segmentation  model.  We  propose  to  use  a  multi-dimensional  histogram  density 
function  as  a  discrete  non-parametric  density  function  of  image  intensity  within  each  subset. 
The  detail  of  this  approach  is  discussed  in  section  8.2.  Section  8.3  discusses  the  implementation 
of  active  contours  using  level  set.  Section  8.4  shows  the  numerical  algorithm  of  the  proposed 
method,  and  section  8.5  shows  the  experimental  results  applied  on  synthetic  and  real  images. 


8.1  Unsupervised  Image  Segmentation 

The  proposed  method  follows  the  classical  role  of  image  segmentation,  which  partitions  an 
image  without  any  prior  knowledge  or  training  samples,  using  a  global  energy  minimization. 
The  global  energy  function  is  given  by 

E({pi},C)  =  W  [  e(x,y\pi)dxdy  +  u\C\,  (8.1) 

i  M 

and  the  global  energy  E  is  minimized  with  respect  to  two  expressions:  {pi},  i.e.  the  statistical 
representation  of  the  image  within  a  subset  and  the  variational  contours  C.  First,  the 
minimum  E  with  respect  to  {pi}  is  achieved  while  C  is  fixed.  Then,  the  minimum  E  with 
respect  to  C  is  achieved  while  {pi}  are  fixed.  An  objective  function 

e(x,y\pi)  = -log pi,  V(.r. //)  e  <h  Vi,  (8.2) 

determines  the  region-based  segmentation  criteria  based  on  the  statistical  distribution  of  image 
intensity  within  each  subset  fV  As  e(-)  has  the  minimum  value  where  pi  has  the  maximum 
value,  the  minimum  E  with  respect  to  {pi}  is  achieved  where  every  pi  has  the  maximum.  This 
is  done  by  a  similar  way  to  the  simplified  Mumford-Shah  segmentation  model  introduced  in 
section  4.2  except  that  pi  is  a  statistical  distribution  of  image  intensity  here,  instead  of  a  simple 
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expression  fa.  The  best  pi  satisfying  the  segmentation  criteria  is  a  conditional  PDF  of  image 
intensity  on  the  condition  that  the  image  pixel  is  an  element  of  given  by 

Pi  =p{l(x,y)\(x,y)  e  Qi),  Vi,  (8.3) 

where  I (x,y)  :  0  — »•  $t,B  denotes  a  vector- valued  image  pixel.  The  estimated  p;  provides  a 
force  to  propagate  the  variational  contours  C  to  the  proper  position  where  C  divides  the  en¬ 
tire  domain  Q  of  the  image  I (x,y)  into  multiple  subsets  {f^}.  This  performs  an  unsupervised 
image  segmentation.  Then,  the  minimization  of  E  with  respect  to  C  smoothes  the  variational 
boundaries  C  minimizing  \C\. 

As  the  proposed  method  does  not  require  any  prior  knowledge  or  training  samples,  the  use 
of  proposed  method  is  convenient  and  flexible,  so  it  can  be  used  in  any  kinds  of  images  as  the 
first  choice.  However,  the  performance  of  segmentation  tends  to  be  limited  depending  on  the 
character  of  the  given  images.  It  is  particularly  limited  if  the  image  has  significant  non-uniform 
subsets.  Supervised  segmentation  methods  introduced  in  the  next  two  chapters  may  show  bet¬ 
ter  performance  in  those  non-uniform  images. 


8.2  Multi-dimensional  Histogram  Density  Function 

We  have  proposed  to  use  a  multivariate  mixture  density  function  as  the  statistical  distribution 
of  image  intensity  within  a  subset.  In  the  proposed  method,  we  propose  to  use  a  normalized 
multi-dimensional  histogram,  which  is  also  introduced  as  a  multi-dimensional  histogram  den¬ 
sity  function  in  section  6.2.1,  as  a  non-parametric  and  discrete  expression  of  the  multivariate 
mixture  density  function. 

Let  1  —  (1(1),  1(2), . . . ,  I(7V)}  be  the  N  observed  data  samples,  and  I  G  ?R:B  be  the  vector¬ 
valued  image  intensity.  The  histogram  density  function  of  given  data  samples  is  given  by 


where  hist[Z]  denotes  a  multi-dimensional  histogram  of  X,  and  AI  denotes  the  volume  of  a 
histogram  bin.  If  the  histogram  density  function  h(I)  is  normalized  as 

5>(J)  =  1  (8.5) 

I 

with  unlimited  number  of  data  samples,  i.e.  N  — >  oo,  and  infinitely  narrow  histogram  bins, 
i.e.  AI  — ►  0,  the  discrete  h(I)  ideally  approximates  a  continuous  PDF,  given  by 

p(I)  ~  lim  lim  h(T).  (8.6) 

AI^O  iV— ► oo 

In  the  proposed  active  contour  model,  h(T)  substitutes  the  pi  in  equation  8.1  and  11.8  as  a 
discrete  non-parametric  form  of  a  multivariate  mixture  density  function. 

In  practical  image  processing  problems,  the  number  of  data  samples  is  limited  and  the  width 
of  a  histogram  bin  cannot  be  zero.  However,  the  native  data  is  already  in  a  sampled  discrete 
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form  in  digital  image  processing,  so  the  quantization  error  caused  by  h( I)  is  reasonably  low 
if  AI  =  1.  The  simple  computation  of  a  histogram  is  another  advantage.  Although  there  are 
many  non-parametric  density  estimation  methods,  e.g.  fc-nearest  neighborhood,  Parzen  window, 
histogram  provides  fairly  reasonable  information  with  a  short  computation  time.  However,  the 
required  amount  of  memory  significantly  increases  depending  on  the  number  of  bands  B  and 
the  resolution  of  image  intensity  AI.  For  example,  a  24bit  RGB  image  requires  224  x  8  =  128 
Mbytes,  while  a  gray  image  requires  only  28  x  8  =  2  Kbytes. 

The  statistics  of  image  intensity  in  an  image  is  not  always  representable  as  a  simple  para¬ 
metric  expression,  such  as  a  Gaussian  distribution.  Since  the  proposed  method  does  not  rely 
on  a  particular  stochastic  model,  the  use  of  the  proposed  method  is  more  flexible  and  robust 
than  other  methods  using  simple  parametric  expressions.  Figure  8.1  shows  an  example  of 
non-uniform  images  whose  statistical  distribution  is  unrepresentable  with  a  simple  paramet¬ 
ric  expression.  The  wood  pattern  shown  in  figure  8.1(a)  shows  a  non-uniform  texture,  and 


Figure  8.1:  Statistical  distribution  of  image  intensity:  (a)  a  wood  pattern,  (b)  the  histogram 

the  statistical  distribution,  i.e.  a  histogram,  of  image  intensity  shown  in  figure  8.1(b)  shows 
a  high  peak  in  the  range  of  high  intensity  and  a  low  and  wide  distribution  along  the  range 
of  low  intensity.  The  statistics  of  these  data  samples  are  not  easily  representable  with  a  sim¬ 
ple  Gaussian  or  other  parametric  distributions  though  it  is  trivial  for  a  non-parametric  method. 


8.3  Contour  Evolution 

The  implementation  of  proposed  active  contour  model  is  done  using  the  multi-phase  active  con¬ 
tour  model  introduced  in  section  7.2.  We  define  multiple  level  set  functions  $  =  {</> i, . . . ,  </>j, . . . ,  cj)j} 
on  the  spatial  domain,  and  each  level  set  function  represents  a  set  of  contours  Cj.  J  level  set 
functions  can  partition  an  image  up  to  2J  subsets  as  shown  in  chapter  7. 


The  Euler-Lagrange  equation  obtained  by  minimizing  the  energy  function  E  with  respect 
to  level  set  functions  <F  =  {</> i, . . . ,  </>j, . . . ,  cj)j}  is  given  by 


9<f>j(x,y) 

dt 


+  ^2  lQg  y)  I  (x,  y)  e  ^ 


2=1 


dXi 

dH, 


,  Vj, 


(8.7) 
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where  Xi(xiV)  :  ^  {0, 1}  denotes  the  binary  identity  functions  introduced  in  equation  7.4 

and  7.5,  and  Hj  =  He((j)j(x,y))  :  14  — ►  3?  denotes  the  regularized  unit  step  function  of  </>j(x,?/), 
introduced  in  equation  7.3.  /i(I(x,  y)|(x,  y)  G  denotes  the  histogram  density  function  mea¬ 
sured  within  Qi,  and  it  determines  the  amount  of  update  in  (f)j(x,y).  Sj  =  5e((f)j(x,y))  :  ^  5R 

performs  a  function  of  a  window,  which  updates  (j>j{x,y)  only  around  the  contours  where 
\<j)j(x,y)\  <  e.  The  curvature  term  Kj  =  K(cf)j(x,y))  :  ST  — >  3^,  introduced  in  equation  3.10, 
keeps  the  regularity  of  contours,  and  v  controls  the  relative  weight  between  the  segmentation 
force  and  regularity  force. 

The  update  of  level  set  functions  is  performed  as  an  iterative  processing.  At  each  iteration, 
we  measure  h(J.(x,y)\(x,y)  G  f \)  of  each  subset,  and  update  each  level  set  function  (j)j{x,y) 
according  to  the  update  function  shown  above  until  the  active  contours  finally  converge  to  the 
position  of  desired  boundaries. 


8.4  Algorithm 


Table  8.1:  The  input  and  output  variables  used  in  the  unsupervised  multi-dimensional  histogram 
method _ 

•  Input 

—  I (x,y)  :  ft  —>  SR5,  a  multispectral  image. 

—  J,  the  number  of  level  set  functions  {4>j(x,  y)}  =  the  number  of  active 
contours  {Cj}. 

—  v,  a  parameter  to  control  the  regularity  of  contours. 

—  At,  the  time  step  of  iterative  processing. 

•  Output 

—  S(x,  y)  :  ft  — ►  {1, . . . ,  2J},  the  segmented  image 


Table  8.1  shows  the  input  and  output  variables  used  in  the  proposed  method.  A  multi¬ 
spectral  image  I (x,y)  is  given  as  the  input  data.  The  number  of  level  set  functions,  which  is 
equivalent  to  the  number  of  contours,  is  set  as  J.  v  controls  the  regularity  of  contour  evolu¬ 
tion  by  controlling  the  relative  balance  between  segmentation  force  and  regularity  force.  At 
determines  the  convergence  speed.  After  the  proposed  active  contours  converge,  the  algorithm 
produces  the  segmented  image  S(x,y)  indicating  each  subset  with  different  numbers. 

As  shown  in  algorithm  1,  the  proposed  active  contour  algorithm  consists  of  two  stages:  ini¬ 
tialization  and  contour  evolution.  During  the  initialization  stage,  J  sets  of  initial  contours  {Cj} 
are  given  either  manually  or  from  a  defined  set,  and  corresponding  level  set  functions  are  initial¬ 
ized  as  the  signed  distance  from  each  (x,  y)  to  the  closest  contours.  During  the  iterative  contour 
evolution,  the  mean  curvature  Kj(x,y)  and  the  unit  step  function  Hj{x,y)  are  computed  from 
each  level  set  function  </)j(x,y).  The  mathematical  description  of  function  meanCurvature(-) 
and  unitStep(-)  are  described  respectively  in  equations  3.10  and  7.3.  Then,  binary  identity 
functions  Xi{xiV)  and  the  histogram  density  functions  h{( I)  for  each  subset  fti  are  measured. 
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The  mathematical  description  of  functions  identityFunction(-)  and  histogramDensity(-)  are 
introduced  respectively  in  equations  7.4  and  8.4.  The  measured  hi( I)  updates  objective  func¬ 
tions  ei(x,y)  for  and  finally  level  set  functions  {<fij(x,y)}  are  updated. 


Algorithm  1:  Unsupervised  Active  Contour  Model  using  Multi-dimensional  Histogram 
Density  Functions 

Initialization 

t  —  0. 

for  j  =  1  •  •  •  J  do 

Cj{x(s),y(s),t  —  0):  Initialize  contours. 

(f)j(x,y,t  =  0)  =  ±D((x,y),N(Xfy}(Cj)):  Initialize  level  set  functions. 

Contour  evolution 
repeat 

for  j  =  1  •  •  •  J  do 

Kj(x,y)  =  meanCurvature((/)j(x,  y)) 

L  Hj(x>y)  =  unitStep(</>j(a;,j/)) 

Unsupervised  region-based  segmentation 
for  i  —  1  •  •  •  2J  do 

Xi{x,y)  =  identityFunction ({Hj(x,y),  Vj}) 
hi(  I)  =  histogramDensity(I  (x,y)xi(x,y)) 

L  ej(a;,  y)  =  -  log  hi( I(x,  y)) 

Update  level  set  function 
for  j  =  1  •  •  •  J  do 

r  2j 

<t>j{x,y)  =  4>j(x,y)  +  A t5(4>j(x,y))  vK(<t>j(x,y))  -  E  ei{x,y)gjr: 

2=1 

t  =  t  +  At 

until  until  the  evolution  of  contour  converges 

Segmentation  result 

=  E  Xi(x,y) 

2=1 


Table  8.2  shows  the  input  and  output  variables  used  in  the  histogram  density  function 
histogramDensity(Z).  The  density  function  takes  three  inputs:  data  samples  X,  the  dynamic 
range  of  image  intensity,  and  the  size  of  histogram  bins  AI.  The  dynamic  range  and  AI  de¬ 
termines  the  resolution  of  the  histogram.  After  computation,  the  density  function  produces  a 
normalized  multi-dimensional  histogram  as  a  non-parametric  and  discrete  density  function.  As 
shown  in  algorithm  2,  the  estimation  of  a  histogram  density  function  consists  of  two  stages: 
initialization  and  computation.  The  mathematical  model  of  this  method  was  introduced  in 
equation  8.4.  First,  the  initial  histogram  is  defined  based  on  the  dimension  of  the  vector-valued 
image  intensity  I  G  ?RB  and  the  given  dynamic  range.  Then,  the  corresponding  histogram  bin 
for  each  data  sample  is  accumulated,  and  the  whole  histogram  is  normalized  in  the  final  stage. 
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Table  8.2:  The  input  and  output  variables  used  in  the  multi-dimensional  histogram  density 
function _ 

•  Input 

—  X  —  (1(1),  1(2), . . . ,  I (n), . . . ,  I(7V)},  I  E  5fts,  data  samples. 

—  [(mini  maxi)T, . . . ,  (min^  max^)T, . . . ,  (min#  maxjg)T],  the  dynamic 
range  of  image  intensity  I  at  each  band. 

—  AI,  the  volume  of  a  histogram  bin. 

•  Output 

—  h( I)  :  5ft5  —>  5ft,  histogram  density  function. 


Algorithm  2:  Histogram  density  function 
Initialization 

hist  [I]:  Initiate  a  5-dimensional  histogram 

Computing  density  function 

for  n  =  1  •  •  •  N  do  hist  (I  (n))  =  hist(I(n))  +  1 
h{ I)  =  hist  (I)/ (TV  AI):  Normalization 


Figure  8.2  shows  the  iterative  contour  evolution  of  the  proposed  active  contour  model.  The 
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Figure  8.2:  The  iterative  procedure  of  proposed  active  contour  model:  (top)  contour  evolution, 
(bottom)  corresponding  segments 


figures  on  the  top  show  the  contour  evolution,  and  the  figures  on  the  bottom  show  the  corre¬ 
sponding  segments.  At  the  initial  stage,  a  set  of  initial  contours  are  given  from  a  predefined 
shape  as  shown  in  the  left  figures.  The  contours  move  separating  the  given  image,  and  finally 
converge  to  the  boundary  of  the  small  rectangle. 
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8.5  Experiments 


In  this  section,  we  apply  the  proposed  segmentation  method  to  three  different  images,  and 
compare  the  results  of  the  proposed  method  to  the  results  of  other  active  contour  models. 
Two  active  contour  models,  which  share  the  same  region-based  segmentation  principle,  were 
used  as  benchmarks:  a  piecewise-constant  active  contour  model  proposed  by  Sandberg  and 
Chan  [5,  123,  140]  and  another  active  contour  model  proposed  by  Rousson  and  Deriche  [10,  117]. 
Sandberg’s  method  will  be  referred  to  method  1,  and  Rousson’s  method  will  be  referred  to 
method  2.  All  three  methods  including  the  proposed  method  determine  the  segmentation  cri¬ 
teria  based  on  the  statistical  property  of  image  intensity  measured  within  a  subset.  Method  1 
represents  each  subset  as  a  vector- valued  constant  number.  Method  2  represents  each  subset  as 
a  multivariate  Gaussian  density  function.  Both  methods  1  and  2  use  parameters  to  represent 
each  subset.  Method  1  uses  the  mean  of  image  intensity  measured  within  the  subset,  while 
method  2  uses  the  mean  and  covariance  of  image  intensity  measured  within  the  subset.  The 
detail  of  mathematical  and  analytical  descriptions  of  both  methods  are  presented  in  section  3.4. 

Figure  8.3(a)  shows  an  example  of  synthetic  gray  images  with  textures.  There  is  a  small 


(a)  (b) 


Figure  8.3:  Synthetic  textured  image:  (a)  a  textured  gray  image,  (b)  the  ground  truth  image 

rectangle  in  the  image,  and  both  the  rectangle  and  the  background  have  their  own  texture 
pattern.  The  object  of  this  experiment  is  to  separate  the  rectangle  from  the  background.  Fig¬ 
ure  8.3(b)  shows  the  ground  truth ,  the  ideal  result  of  segmentation,  of  the  given  image.  The 
dark  part  presents  the  background,  class  1,  while  the  bright  part  presents  the  rectangle,  class 
2. 


Figure  8.4  shows  the  result  of  method  1  applied  on  figure  8.3(a).  Figure  8.4(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  8.4(b)  shows  the  corresponding  segmentation  result. 
Method  1  cannot  partition  the  rectangle  as  a  single  object  as  it  separates  the  rectangle  as 
dark  sub-regions  and  bright  sub-regions,  and  the  same  effect  also  appears  on  the  background. 
Method  1  fails  in  this  problem  because  the  dark  intensity  and  the  bright  intensity  of  the  rect¬ 
angle  stay  too  far  away  in  the  image  intensity  space,  so  it  cannot  represent  the  subset  with  the 
average  value  of  the  intensity. 

Figure  8.5  shows  the  result  of  method  2  applied  on  figure  8.3(a).  Figure  8.5(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  8.5(b)  shows  the  corresponding  segmentation  result. 
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Figure  8.4:  Method  1  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  8.5:  Method  2  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


Method  2  shows  a  better  segmentation  result  than  method  1.  It  successfully  separates  the 
rectangle  from  the  background,  but  there  are  still  a  few  artifacts  in  the  background. 

Figure  8.6  shows  the  result  of  the  proposed  method  applied  on  the  same  image.  Figure  8.6(a) 
shows  the  final  stage  of  contour  evolution,  and  figure  8.6(b)  shows  the  corresponding  segmen¬ 
tation  result.  The  proposed  method  successfully  separates  the  rectangle  from  the  background 
without  significant  artifacts,  which  were  observed  in  the  cases  of  other  two  methods. 

For  more  precise  comparison,  we  measure  the  statistical  distribution  of  image  intensity 
within  class  2,  which  is  supposed  to  be  the  small  rectangle  in  the  segmentation  result.  Figure  8.7(a),  8.7(b), 
and  8.7(c)  respectively  present  the  results  of  method  1,  method  2,  and  the  proposed  method. 

The  blue  solid  line  of  all  three  graphs  presents  the  statistical  distribution  of  image  intensity 
within  the  rectangle  in  the  ground  truth  image.  Because  of  the  wood  pattern,  the  statistical 
distribution  of  the  rectangle  is  not  simple.  There  is  a  high  peak  at  the  region  of  high  intensity 
in  the  graph,  and  there  are  multiple  low  peaks  at  the  region  of  dark  intensity  in  the  graph. 


53 


(a)  (b) 

Figure  8.6:  Proposed  method  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 
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Figure  8.7:  Statistical  distribution  of  image  intensity  within  class  2,  the  small  rectangle:  (a) 
method  1,  (b)  method  2,  (c)  proposed  method 


The  green  solid  line  of  all  three  graphs  presents  the  same  statistics  measured  within  class  2  in 
the  segmentation  result.  As  the  green  solid  line  exists  closer  to  the  blue  solid  line,  it  presents 
better  result.  In  figure  8.7(a),  the  dotted  green  line  with  a  circle  presents  the  representation  of 
the  subset  used  in  method  1,  i.e.  the  mean  of  green  solid  line.  The  green  line  is  very  different 
from  the  ground  truth  because  the  constant  representation  is  not  sufficient  to  represent  the 
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complicated  ground  truth.  In  figure  8.7(b),  the  dotted  green  line  presents  the  representation  of 
the  subset  used  in  method  2,  i.e.  the  Gaussian  density  function  of  image  intensity  measured. 
Although  the  green  solid  line  stays  close  to  the  blue  line,  the  dotted  line  seems  quite  different 
from  the  actual  distribution,  and  it  consequently  results  in  the  artifacts  shown  in  figure  8.5. 
In  figure  8.7(c),  the  result  of  the  proposed  method  and  the  ground  truth  data  is  almost  identical. 

Figure  8.8(a)  shows  another  example  of  gray  images  with  textures.  There  are  five  objects 


(a)  (b) 


Figure  8.8:  Synthetic  textured  image:  (a)  a  textured  gray  image,  (b)  the  ground  truth  image 

in  a  synthetic  gray  image:  a  circle,  a  rectangle,  a  horizontal  ellipse,  a  vertical  ellipse,  and  a 
triangle.  The  circle  and  vertical  ellipse  have  uniform  image  intensity,  but  the  triangle  and  the 
horizontal  ellipse  have  textured  pattern.  The  rectangle  has  a  gradient  intensity.  The  overall 
image  has  some  noise,  so  the  edges  between  each  object  and  the  background  are  not  clear.  The 
object  of  this  experiment  is  to  separate  all  objects  from  the  background  forming  two  classes, 
i.e.  background  and  objects.  Figure  8.8(b)  shows  the  ground  truth  of  this  experiment.  Class 
1,  i.e.  background,  appears  as  a  bright  sub-region,  while  class  2,  i.e.  objects,  appear  as  dark 
sub-regions. 

Figure  8.9  shows  the  result  of  method  1  applied  on  figure  8.8(a).  Figure  8.9(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  8.9(b)  shows  the  corresponding  segmentation  result. 
Although  method  1  successfully  partitions  the  circle  and  vertical  ellipse  as  class  2,  it  completely 
fails  to  separate  the  horizontal  ellipse  from  the  background.  Also,  only  some  portions  of  the 
rectangle  and  triangle  appear  as  class  2  in  the  final  result  due  to  their  textured  pattern  and 
gradient  intensity.  Since  method  1  represents  two  classes  as  two  constant  numbers,  it  cannot 
recognize  an  object  if  the  image  intensity  of  the  object  is  not  uniform. 

Figure  8.10  shows  the  result  of  method  2  applied  on  figure  8.8(a).  Figure  8.10(a)  shows 
the  final  stage  of  contour  evolution,  and  figure  8.10(b)  shows  the  corresponding  segmentation 
result.  Method  2  shows  a  better  segmentation  result  than  method  1  successfully  separating 
the  circle,  vertical  ellipse,  triangle,  and  rectangle  from  the  background.  However,  there  are 
still  a  few  artifacts  on  the  horizontal  ellipse  due  to  the  textured  pattern  because  the  statistical 
distribution  of  the  texture  pattern  stays  out  of  the  Gaussian  density  function  of  class  2. 

Figure  8.11  shows  the  result  of  the  proposed  method  applied  on  the  same  image.  Figure  8.11(a) 
shows  the  final  stage  of  contour  evolution,  and  figure  8.11(b)  shows  the  corresponding  segmen- 
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(a)  (b) 

Figure  8.9:  Method  1  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  8.10:  Method  2  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


tation  result.  The  proposed  method  successfully  separates  all  five  objects,  even  the  horizontal 
ellipse  with  texture  pattern,  from  the  background  without  significant  artifacts.  This  experiment 
shows  how  robust  the  proposed  method  is  in  the  segmentation  problems  of  non-uniform  images. 

Figure  8.12  shows  an  example  of  outdoor  RGB  images.  This  color  image  consists  of  three 
bands:  red,  green,  blue.  In  the  picture,  there  is  a  gray  squirrel  in  the  green  grass  field.  The  back 
of  the  squirrel  shows  a  mix  of  gray  color,  the  stomach  side  of  the  squirrel  shows  a  mix  of  pink, 
and  the  tail  shows  a  mix  of  pink  and  brown.  Due  to  the  texture  of  fur,  there  are  a  significant 
number  of  narrow  shadows  and  sharp  edges  along  the  whole  body  of  the  squirrel  as  well  as 
the  grass  field.  The  object  of  this  experiment  is  to  separate  the  squirrel  from  the  grass  field, 
i.e.  background.  This  is  a  difficult  segmentation  problem  because  the  color  of  neither  squirrel 
or  grass  field  is  uniform.  Although  this  non-uniform  texture  is  a  common  problem  found  in  any 
outdoor  color  images,  it  makes  image  segmentation  very  difficult. 

Figure  8.13  shows  the  result  of  method  1  applied  on  figure  8.12.  Figure  8.13(a)  shows  the 
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(a)  (b) 

Figure  8.11:  Proposed  method  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 


Figure  8.12:  An  outdoor  RGB  image 


(a)  (b) 

Figure  8.13:  Method  1  applied  to  an  outdoor  RGB  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


final  stage  of  contour  evolution,  and  figure  8.13(b)  shows  the  corresponding  segmentation  result. 
Because  of  the  strong  sharp  texture  pattern  on  the  squirrel  and  the  grass  field,  method  1  pro¬ 
duces  a  lot  of  noise-like  artifacts  but  cannot  properly  separate  the  squirrel  from  the  background. 
Both  the  grass  field  and  the  squirrel  are  too  complicated  to  be  represented  as  a  constant  number 
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due  to  the  presence  of  many  small  edges  and  shadows. 

Figure  8.14  shows  the  result  of  method  2  applied  on  figure  8.12.  Figure  8.14(a)  shows  the 


(a)  (b) 

Figure  8.14:  Method  2  applied  to  an  outdoor  RGB  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  8.14(b)  shows  the  corresponding  segmentation  re¬ 
sult.  Method  2  shows  a  better  segmentation  result  than  method  1  successfully  separating  the 
squirrel  from  the  grass  field.  However,  there  are  still  a  few  artifacts  on  the  background  due  to 
the  small  edges  and  shadows  of  grass  field. 

Figure  8.15  shows  the  result  of  the  proposed  method  applied  on  the  same  image.  Figure  8.15(a) 


(a)  (b) 

Figure  8.15:  Proposed  method  applied  to  an  outdoor  RGB  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 

shows  the  final  stage  of  contour  evolution,  and  figure  8.15(b)  shows  the  corresponding  segmen¬ 
tation  result.  The  proposed  method  successfully  separates  the  squirrel  from  the  grass  field,  and 
the  number  of  noise-like  artifacts  is  much  lower  than  method  2  as  well  as  method  1.  This  exper¬ 
iment  shows  that  the  proposed  method  is  robust  in  the  segmentation  problems  of  multispectral 
images  with  non-uniform  subsets,  which  is  common  in  outdoor  color  images. 
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Chapter  9 

Half-supervised  Active  Contour 
Model  using  Multivariate  Gaussian 
Mixture  Density  Functions 


In  this  chapter,  we  propose  a  half-supervised1  active  contour  model  using  a  parametric  density 
estimation  method  introduced  in  section  6.1.1.  The  proposed  method  follows  the  same  region- 
based  segmentation  principle  introduced  in  chapter  8  except  for  two  major  modifications.  First, 
a  supervised  estimation  method,  instead  of  an  unsupervised  method,  estimates  the  conditional 
PDF  pi  from  prior  knowledge.  Second,  pi  is  estimated  as  a  mixture  of  parametric  continuous 
functions  instead  of  a  non-parametric  discrete  function.  Section  9.1  discusses  the  parametric 
density  estimation  method,  and  section  9.2  discusses  the  half-supervised  image  segmentation 
model.  Section  9.3  presents  the  numerical  algorithms  of  the  proposed  method,  and  section  9.4 
presents  the  experimental  results  applied  on  synthetic  and  real  images. 


9.1  Multivariate  Gaussian  Mixture  Density  Function  Estimated 
by  EM 

In  chapter  8,  we  proposed  to  estimate  the  conditional  PDF  pi  of  image  intensity  within  a  subset 
fti  as  a  non-parametric  discrete  function.  In  this  chapter,  we  propose  to  estimate  pi  as  a  mixture 
of  continuous  parametric  density  functions 


K 

Pi  =  ^akp(  I|flfc),  (9.1) 

k= 1 

where  the  number  of  sub-classes  K  and  the  weight  of  each  sub-class  { a determine  the  global 
statistical  property  of  a  mixture  density  function  pi,  and  the  parameter  set  6 &  determines 
the  local  statistical  property  of  a  sub-class  k.  The  PDF  of  each  sub-class  is  represented  as  a 
multivariate  Gaussian  density  function 

p(I\ek)=p(I\Hk,-Zk),\/k,  (9.2) 

1  Although  there  is  no  such  term  as  half- supervised,  we  refer  to  the  proposed  method  as  a  half-supervised 
method  because  it  requires  relatively  lower  level  of  prior  knowledge  compared  to  the  supervised  method  introduced 
in  the  next  chapter. 
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where  fik  and  respectively  denote  the  mean  vector  and  the  covariance  matrix  of  data  belong 
to  the  sub-class  k.  A  global  PDF  can  be  estimated  as  a  mixture  of  K  Gaussian  density  functions 
by  estimating  these  four  sets  of  parameters:  K ,  {ak},  and  {E&}. 

The  proposed  method  uses  the  classic  EM  method,  introduced  in  section  6.1.1,  to  estimate 
the  mixture  density  function  pi  from  given  data  samples  X,  such  as 

K 

Pi^^2akp(l\fik,tk).  (9.3) 

k= 1 

As  mentioned  in  section  6.1,  there  are  two  problems  to  estimate  a  mixture  density  function 
using  the  classic  EM  method.  First,  classic  EM  method  cannot  estimate  K.  Second,  the  result 
of  classic  EM  method  highly  depends  on  the  initial  values  of  the  parameters  to  be  estimated: 
{ak(t  =  0)},  {pk(t  =  0)},  and  =  0)}.  In  the  proposed  method,  we  solve  these  prob¬ 

lems  by  providing  partial  information  of  pi  as  a  prior  knowledge,  and  then  estimate  the  rest  of 
information  from  the  given  data  samples  and  the  prior  knowledge.  Thus,  the  parameters  are 
estimated  by  a  supervised  method  instead  of  an  unsupervised  method  as  chapter  8. 

A  total  of  four  sets  of  parameters  are  required  to  present  pi  as  a  mixture  of  Gaussian  density 
functions:  the  number  of  sub-classes  iF,  the  weight  of  each  sub-class  {<a&},  the  mean  vector 
of  each  sub-class  {//&},  and  the  covariance  matrix  of  each  sub-class  {E&}.  Let  us  assume  that 
some  training  samples  are  available  and  the  statistical  properties  of  both  the  training  sam¬ 
ples  and  test  data  are  similar.  We  can  find  the  expected  values  of  K  and  {pk}  from  training 
samples  and  provide  them  as  the  prior  knowledge  of  supervised  density  estimation.  Then,  the 
proposed  EM  method  needs  to  estimate  only  {ak}  and  {E&}  from  the  test  data.  Figure  9.1 
shows  an  example  of  mixture  density  estimation  using  the  proposed  method.  The  blue  solid 


Figure  9.1:  Density  estimation  using  a  supervised  EM  method 

line  presents  the  histogram  of  data  samples,  i.e.  the  true  statistical  distribution  of  the  given 
data.  There  are  three  peaks  in  the  histogram,  thus  the  PDF  of  the  given  data  should  be  rep¬ 
resented  as  a  mixture  of  at  least  three  Gaussian  density  functions.  Let  us  assume  that  two 
information  are  available:  the  expected  number  of  sub-classes  K  —  3  and  the  expected  values 
of  the  mean  of  each  sub-class  {/iq,  /i2,  p 3}  shown  as  the  three  red  circles.  Then,  the  classic  EM 
method  can  estimate  the  weights  of  each  sub-class  and  the  covariance  matrix  of 

each  sub-class  {dq,  oq,  63}  from  the  given  data.  The  red  dotted  line  presents  the  estimated  para¬ 
metric  mixture  density  function.  The  mathematic  description  of  the  general  EM  method  was 


60 


described  in  section  6.1.1,  and  the  algorithm  of  the  proposed  method  is  presented  in  section  9.3. 


9.2  Half-supervised  Image  Segmentation 

The  proposed  segmentation  method  consists  of  two  stages:  training  and  test.  During  the  train¬ 
ing  stage,  a  low  level  of  prior  knowledge  is  provided.  The  test  stage  consists  of  two  procedures. 
First,  we  estimate  the  statistical  distribution  of  image  intensity  within  each  subset  as  a  mixture 
of  continuous  parametric  density  functions.  Then,  we  partition  an  image  using  the  same  energy 
minimization  based  image  segmentation  principle  used  in  the  unsupervised  multi-dimensional 
histogram  method. 

The  same  global  energy  function  introduced  in  section  8.1 

E({Pi},C)  =  T  [  e(x,y\pi)dxdy  +  v\C\  (9.4) 

is  minimized  with  respect  to  two  expressions:  {p^},  i.e.  the  representation  of  the  image  within 
a  subset  and  the  variational  contours  C.  The  objective  function  e(x,y\pi)  determines  the 
region-based  segmentation  criteria  based  on  p^,  and  the  best  pi  satisfying  the  segmentation 
criteria  is  a  conditional  PDF  of  image  intensity  on  the  condition  that  the  image  pixel  is  an 
element  of  f^,  i.e.  p(I(x,y)\(x,y)  E 

In  the  proposed  method,  pi  is  represented  by  four  sets  of  parameters:  the  number  of  sub¬ 
classes  iF,  the  weight  of  each  sub-class  the  mean  vector  of  each  sub-class  and 

the  covariance  matrix  of  each  sub-class  {£&}•  Before  the  image  segmentation,  i.e.  during  the 
training  stage,  the  expected  values  of  K  and  {/xk}  are  provided  from  training  samples.  Dur¬ 
ing  image  segmentation,  i.e.  the  test  stage,  {ak}  and  {E&}  of  image  intensity  within  are 
estimated  by  the  classic  EM  method  as  described  in  the  previous  section.  The  given  param¬ 
eters,  K  and  {£ik},  and  the  estimated  parameters,  {ak}  and  {E&},  represent  the  conditional 
PDF  pi  of  image  intensity  at  each  subset,  and  the  estimated  pi  provides  a  force  to  propagate 
the  variational  contours  C  to  the  proper  position  where  C  divides  the  entire  domain  D  of  the 
image  I (x,y)  into  multiple  subsets  {f^}.  Figure  9.2  and  9.3  show  an  example  of  the  proposed 
half-supervised  segmentation.  Figure  9.2(a)  shows  an  RGB  image  consisting  of  two  military 
camouflage  patterns  where  each  of  the  pattern  consists  of  at  least  three  different  paints,  and 
figure  9.2(b)  shows  the  two  subsets  of  the  given  image  as  {Di,!^}.  Since  both  camouflage 
patterns  consist  of  three  different  paints,  the  PDF  of  image  intensity  for  each  subset  should  be 
a  mixture  of  at  least  three  sub-classes.  Figure  9.3  shows  the  example  of  training  stage  of  the 
proposed  method.  The  vector- valued  image  intensities  at  six  points  are  measured  from  training 
samples,  and  assigned  as  {/xl5  /x2,  AG>  AG,  AG,  AG}  f°r  background  in  figure  9.3(a).  Three 
samples  are  measured  from  training  samples,  and  assigned  as  {ag,  AG>  AG}  f°r  rectangle  D2 
in  figure  9.3(b).  During  the  image  segmentation,  {ak}  and  {E&}  are  estimated  for  both  subsets 
{Qi,  D2}  from  the  given  expected  values  of  K  and  {ftk}  and  the  image  intensity  of  each  subset 
{I(x,y)\\/(x,y)  G 

The  minimization  of  E  with  respect  to  C  smoothes  the  variational  boundaries  C  minimizing 
the  length  \C\.  We  use  the  same  level  set  theory  introduced  in  section  8.3  to  implement  the 
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(a)  (b) 


Figure  9.2:  A  complicated  synthetic  RGB  image:  (a)  an  RGB  image  with  two  camouflage 
patterns,  (b)  the  ground  truth  image 


(a) 

Figure  9.3:  Training  stage  of  the  proposed  method:  (a)  6  samples  measured  for  background, 
(b)  3  samples  measured  for  the  rectangle 

proposed  active  contour  model. 

The  proposed  half-supervised  image  segmentation  method  performs  better  than  the  unsu¬ 
pervised  image  segmentation  method  introduced  in  the  last  chapter  particularly  if  the  image 
has  distinctive  non-uniform  sub-regions  as  shown  in  figure  9.3.  Since  the  statistical  property  of 
each  subset  must  be  represented  as  a  mixture  density  function,  the  segmentation  of  the  images 
like  figure  9.3  is  extremely  difficult  for  unsupervised  segmentation  methods.  The  supervised 
density  estimation  improves  the  classic  EM  method  providing  more  robust  segmentation  capa¬ 
bility,  and  the  parametric  mixture  density  function  provides  more  flexibility  than  the  histogram 
density  function. 


9.3  Algorithm 

Table  9.1  shows  the  input  and  output  variables  used  in  the  proposed  segmentation  method. 
A  multispectral  image  I(x,  y)  is  given  as  the  input  data.  The  number  of  level  set  functions, 
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Table  9.1:  The  input  and  output  variables  used  in  the  half-supervised  multivariate  Gaussian 
mixture  density  method 

•  Input 

—  I (x,y)  :  ft  —>  3?^,  a  multispectral  image. 

—  J,  the  number  of  level  set  functions  {<fij(x,  y)}  =  the  number  of  active 
contours  {Cj}. 

—  v,  a  parameter  to  control  the  regularity  of  contours. 

—  At,  the  time  step  of  iterative  processing. 

-  {TiM,--  ,X2j},  training  samples  for  each  subset. 

•  Output 

—  S(x,  y)  :  ft  —>  { 1, . . . ,  2J},  the  segmented  image 


which  is  equivalent  to  the  number  of  contours,  is  set  as  J.  v  controls  the  regularity  of  contour 
evolution  by  controlling  the  relative  balance  between  segmentation  force  and  regularity  force. 
At  determines  the  convergence  speed.  The  expected  number  K  of  sub-classes  and  the  expected 
mean  vectors  {fik}  of  sub-classes  for  each  subset  is  measured  from  the  training  samples  li  for 
each  subset  and  they  provide  prior  information  on  supervised  density  estimation  for  {ak} 
and  {E&}.  After  the  proposed  active  contours  converge,  the  algorithm  produces  the  segmented 
image  S(x,y)  indicating  each  subset  with  different  numbers. 

The  proposed  active  contour  algorithm  consists  of  three  stages:  training,  initialization,  and 
contour  evolution  (test).  During  the  training  stage,  the  expected  number  K  of  sub-classes 
and  the  expected  mean  vectors  {fik}  of  sub-classes  are  measured  from  the  training  samples 
li  and  assigned  for  each  subset  fti.  The  description  of  function  priorlnfo(-)  is  presented  in 
section  9.2.  During  the  initialization  stage,  J  sets  of  initial  contours  {Cj}  are  given  either  man¬ 
ually  or  from  a  defined  set,  and  corresponding  level  set  functions  are  initialized  as  the  signed 
distance  from  each  (x,y)  to  the  closest  contours.  During  the  iterative  contour  evolution,  the 
mean  curvature  Kj(x,y)  and  the  unit  step  function  Hj(x,y)  are  computed  from  each  cj)j(x,y). 
The  mathematical  description  of  function  meanCurvature(-)  and  unitStep(-)  are  introduced 
respectively  in  equation  3.10  and  7.3.  Then,  binary  identity  functions  Xi{xi  y)  for  each  subset  fti 
are  computed.  The  mathematical  description  of  function  identityFunction(-)  is  introduced 
in  equation  7.4.  The  weights  {ak}  and  covariance  matrix  {E&}  of  sub-classes  are  estimated 
from  the  given  {pik}  and  the  image  intensity  measured  within  each  subset  I (x,y)xi(x,y)  using 
the  classic  EM  method.  The  mathematical  description  of  function  EM(-)  is  introduced  in  sec¬ 
tion  6.1.1,  and  the  numerical  algorithm  is  presented  in  algorithm  4.  The  estimated  and 
{E&}  and  the  given  {fik}  determine  the  conditional  PDF  of  image  intensity  and  pi  updates 
the  objective  function  e^(x,p)  for  fti.  The  level  set  functions  {(f)j(x,y)}  are  finally  updated 
based  on  Xi(x,y)  and  ei(x,y)- 
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Algorithm  3:  Half-supervised  Active  Contour  Model  using  Multivariate  Gaussian  Mix¬ 
ture  Density  Functions 


Training 

for  i  —  1  •  •  •  2 J  do 

;  [K,  ~  priorlnf  o(Xj):  The  expected  number  and  the  means  of  sub-classes 

Initialization 
t  —  0. 

for  j  =  1  •  •  •  J  do 

Cj(x(s),y(s),t  =  0):  Initialize  contours. 

4>j(x,y,t  —  0)  =  ±D((x,y),N(Xiy}(Cj)):  Initialize  level  set  functions. 

Contour  evolution 
repeat 

for  j  =  1  •  •  •  J  do 

Kj(x,y)  =  meanCurvatur  e((j>j(x,y)) 

L  Hj(x>y)  =  unitStep  (<f>j(x,y)) 

Half-supervised  region-based  segmentation 
for  i  —  1  •  •  •  2 J  do 

Xi(z,y)  =  identityFunction ({Hj(x,y),  Vj}) 

[{afc},{Sfc}]i  =  EM(I(x,y)xi(a:,y)|[i^,{Afc}Ji) 

L  e*(z,  y)  =  -  iogp(i(a?,  y)  I  [K,  {®k},  {Afc},  {Xfc}] i) 

Update  level  set  function 
for  j  =  1  •  •  •  J  do 

r  ~ 

<t>j(x,y)  =  4>j{x,y)  +  AM(<^(a;,y))  UK((f>j(x,y))  -  X  ei(x,y)j$: 


I  £  =  t  +  At 

until  until  the  evolution  of  contour  converges 

Segmentation  result 

2J 

S(x,y)  =  X  XiO>y) 

2=1 

Table  9.2  shows  the  input  and  output  variables  used  in  the  function  EM(Z|iF,  {/xfe}).  The 


Table  9.2:  The  input  and  output  variables  used  in  the  estimation  of  a  mixture  of  multivariate 
Gaussian  density  functions  using  a  classic  EM  method 
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supervised  EM  method  takes  three  inputs,  data  samples  X,  the  number  of  sub-classes  K ,  and 
the  mean  vector  of  each  sub-class  and  produces  two  outputs,  the  estimated  weight  {ak} 

and  covariance  matrix  {E&}  of  each  sub-class. 

The  proposed  EM  method  estimates  the  output  values  through  an  iterative  process  consist¬ 
ing  of  three  steps:  expectation  (E-step),  maximization  (M-step),  and  evaluation.  During  E-step, 
the  expected  posterior  probabilities  of  the  given  data  samples  based  on  the  current  and  given 
fik  for  each  sub-class  k  are  estimated  and  updated.  During  M-step,  two  output  parameters,  ak 
and  Efc,  which  maximize  the  likelihood  function  Q(t),  are  estimated  for  each  sub-class  k.  Dur¬ 
ing  the  evaluation  step,  we  check  the  value  of  Q(£),  and  stop  the  iteration  if  Q(t)  has  converged. 


Algorithm  4:  Half-supervised  classic  EM  Algorithm 

Initialization 
for  k  =  1  •  •  •  K  do 

otk  =  1/^ 

identity  matrix 
Q(0)  oo,  A Q  —  oo 


Estimation 

while  A Q  >  e  do 

E-step:  update  posterior  probabilities, 

for  k  =  1  •  •  •  K  do 

L  for  n  =  1  •  •  •  TV  do  uk(n)  =  p(I(n)\pk,  tk) 
for  k  =  1  •  •  •  K  do 

|_  for  n  =  1  •  •  •  N  do  wk(n)  =  akuk(n)/ XjL  1  <*jUj(n) 

M-step:  update  parameter  set  ©  =  [{d&},  {E^}] . 

for  k  =  1  •  •  •  K  do  ak  =  ±  J2n=i  wk(n ) 

{®fc}  =  oik/  Vj-1  oiji 

for  k  =  1  •  •  •  K  do 


E?=i  ^(n)(I(n)-Mfc)(I(n)-Mfc)T 

E^=i  wk(n) 


Evaluation:  check  the  criterion  to  stop  the  iteration. 

Qif)  =  jf  Xn=i  Ef=  l  wk{n )  log  akuk{n) 

A Q  =  | Q(t)  -  Q(t  -  1)| 
t  =  t  +  1 


9.4  Experiments 

In  this  section,  we  apply  the  proposed  segmentation  method  to  four  different  images,  and  com¬ 
pare  the  results  of  the  proposed  method  to  the  results  of  other  active  contour  models.  The 
same  region-based  active  contour  models  used  in  the  last  chapter  were  used  as  benchmarks. 
All  three  methods  including  the  proposed  method  determine  the  segmentation  criteria  based 
on  the  statistical  property  of  image  intensity  measured  within  a  subset.  Method  1  and  2  are 
unsupervised  methods,  while  the  proposed  method  uses  prior  knowledge  from  the  given  training 
samples.  The  experiments  of  all  three  methods  use  the  same  pre-defined  initial  contours.  The 
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detail  of  mathematical  and  analytical  descriptions  of  both  benchmarks  methods  are  presented 
in  section  3.4. 

Figure  9.4  shows  an  example  of  synthetic  RGB  images  with  complicated  non-uniform  pat¬ 
terns.  There  is  a  rectangle  in  the  middle  of  the  image  in  figure  9.4(a).  Figure  9.4(b)  shows  the 


(a)  (b) 


Figure  9.4:  A  complicated  synthetic  RGB  image:  (a)  an  RGB  image  with  two  camouflage 
patterns,  (b)  the  ground  truth  image 

ground  truth  information  of  the  given  image.  The  object  of  this  experiment  is  to  separate  the 
rectangle,  class  2,  from  the  background,  class  1.  This  is  a  difficult  segmentation  problem  be¬ 
cause  the  statistical  distribution  of  both  the  rectangle  and  the  background  are  complicated  due 
to  the  military  camouflage  patterns  consisting  of  multiple  paints.  Also,  the  image  is  manipu¬ 
lated,  so  the  mean  vectors  of  both  interior  and  exterior  are  identical  as  [R,  G,  B\  —  [163, 164,  77]. 

Figure  9.5  shows  the  result  of  method  1  applied  on  figure  9.4.  Figure  9.5(a)  shows  the 


(a)  (b) 

Figure  9.5:  Method  1  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  9.5(b)  shows  the  corresponding  segmentation  result. 
Method  1  cannot  separate  the  rectangle  as  a  single  object  because  it  separates  the  rectangle 
as  two  sub-regions,  bright  and  dark,  and  the  same  effect  also  appears  on  the  background.  The 
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dark  sub-class  and  the  bright  sub-class  of  the  rectangle  stay  too  far  sway  in  the  image  intensity 
space,  so  the  average  value  of  the  intensity  cannot  represent  the  subset.  The  same  problem 
appears  on  the  background. 

Figure  9.6  shows  the  result  of  method  2  applied  on  figure  9.4.  Figure  9.6(a)  shows  the 


(a)  (b) 

Figure  9.6:  Method  2  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  9.6(b)  shows  the  corresponding  segmentation  result. 
Method  2  cannot  separate  the  rectangle  as  a  single  object  because  of  the  same  reason  that 
method  1  fails.  Experiments  shown  in  figure  9.5  and  9.6  show  the  limitation  of  unsupervised 
segmentation  methods  on  the  images  with  non-uniform  sub-regions. 

We  applied  the  proposed  half-supervised  segmentation  method  on  the  same  image.  Fig¬ 
ure  9.7  shows  the  training  stage  of  the  proposed  method.  Training  samples  for  both  the 


(a)  (b) 

Figure  9.7:  Training  stage  of  the  proposed  method:  (a)  6  samples  measured  for  background, 
(b)  3  samples  measured  for  the  core 

rectangle  and  the  background  are  provided.  Six  vector-valued  image  intensities  are  measured 
from  the  training  samples  and  assigned  as  {/i1,  /x2, . . . ,  /x6}  for  the  background  in  figure  9.7(a). 
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Three  vector-valued  image  intensities  are  measured  from  the  training  samples  and  assigned  as 
{A:l,A2>  A 3}  f°r  the  rectangle  in  figure  9.7(b).  Figure  9.8  shows  the  the  result  of  the  proposed 
method.  Figure  9.8(a)  shows  the  final  stage  of  contour  evolution,  and  figure  9.8(b)  shows  the 


(a)  (b) 

Figure  9.8:  Proposed  method  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final  stage 
of  contour  evolution,  (b)  the  segmented  subsets 

corresponding  segmentation  result.  The  proposed  method  successfully  separates  the  rectangle 
from  the  background  without  any  significant  artifact. 

For  more  precise  comparison,  the  statistical  distribution  of  image  intensity  within  both  class 
1  and  2  are  measured.  Figure  9.9  shows  the  statistical  distribution  of  image  intensity  measured 
at  green  channel  within  class  1,  which  is  supposed  to  be  the  background  in  the  segmentation 
result.  Figure  9.9(a),  9.9(b),  and  9.9(c)  respectively  present  the  results  of  method  1,  method 
2,  and  the  proposed  method.  The  blue  solid  lines  of  all  three  graphs  present  the  statistical 
distribution  of  image  intensity  measured  at  the  background,  and  it  shows  that  the  camouflage 
pattern  of  background  consists  of  at  least  three  different  paints.  The  green  solid  lines  of  all 
three  graphs  present  the  same  statistics  measured  within  class  1  in  the  segmentation  results. 
As  the  green  solid  line  exists  closer  to  the  blue  solid  line,  it  presents  better  result.  In  fig¬ 
ure  9.9(a),  the  dotted  red  line  with  a  circle  presents  the  representation  of  the  subset  used  in 
method  1,  i.e.  the  mean  of  green  solid  line.  The  measured  statistics  are  very  different  from  the 
ground  truth  because  the  constant  representation  is  not  sufficient  to  represent  the  non-uniform 
sub- region  consisting  of  three  different  paints.  In  figure  9.9(b),  the  dotted  red  line  presents 
the  representation  of  the  subset  used  in  method  2,  i.e.  the  Gaussian  density  function  of  image 
intensity  measured.  The  measured  statistics  is  very  different  from  the  ground  truth  because 
of  the  same  reason  as  figure  9.9(a).  The  red  circles  in  figure  9.9(c)  present  {/xl5  /x2,  •  •  • ,  Ael 
measured  from  training  samples.  They  are  the  centers  of  each  sub-class.  The  dotted  red  line 
presents  the  estimated  parametric  mixture  density  function  at  the  final  stage  of  contour  evo¬ 
lution.  Although  the  estimated  mixture  density  function  is  not  identical  to  the  true  statistical 
distribution,  i.e.  the  blue  solid  line,  it  successfully  finds  the  position  of  peaks  producing  a  good 
segmentation  result.  The  estimated  weights  {di,  . . . ,  &q}  help  to  ignore  insignificant  sub¬ 
classes  and  to  emphasize  significant  sub-classes.  The  measured  statistics  and  the  ground  truth 
are  almost  identical.  Figure  9.10  shows  the  same  measure  for  class  2,  the  rectangle.  Both 
method  1  and  2  cannot  produce  satisfying  results,  while  the  proposed  method  successfully  es¬ 
timates  a  mixture  density  function,  which  is  close  to  the  true  statistical  distribution  of  image 
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Figure  9.9:  Statistical  distribution  of  image  intensity  within  the  background  measured  at  green 
channel:  (a)  method  1,  (b)  method  2,  (c)  proposed  method 


intensity,  from  {/i1,  /h2,  A 3}.  The  measured  statistics  and  the  ground  truth  are  almost  identical. 

Figure  9.11  shows  an  RGB  image  captured  from  a  virtual  3D  scene  generated  by  an  ATR 
simulator.  The  background  is  reconstructed  from  an  air-borne  range  data,  and  the  correspond¬ 
ing  color  picture  is  put  on  the  reconstructed  surface  as  texture.  We  also  reconstruct  a  tank 
model,  and  put  the  camouflage  pattern  shown  in  figure  9.7(b)  on  its  surface.  The  object  of 
this  experiment  is  to  extract  the  shape  of  the  tank  from  the  scene  as  a  target.  This  image  is 
another  difficult  problem  because  there  are  a  lot  of  shades  on  the  surface  of  both  the  tank  and 
the  background  as  well  as  the  complicated  camouflage  pattern. 

Figure  9.12  shows  the  result  of  method  1  applied  on  figure  9.11.  Figure  9.12(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  9.12(b)  shows  the  corresponding  segmentation  result. 
Method  1  identifies  the  sky  as  the  target,  the  white  subset.  Also,  method  1  cannot  extract  the 
whole  body  of  the  tank  from  the  scene  because  of  the  complicated  camouflage  pattern.  Method 
2  also  produces  an  undesirable  result  as  shown  in  figure  9.13.  The  mountains  and  other  areas 
in  the  background  are  identified  as  target,  the  dark  subset. 

We  applied  the  proposed  half-supervised  segmentation  method  on  the  same  image.  Fig¬ 
ure  9.14  shows  the  training  stage  of  the  proposed  method.  We  used  the  same  image  as  the 
training  samples  for  this  experiment.  Three  vector- valued  image  intensities  are  measured  from 
a  mountain,  sky  and  the  field,  and  assigned  for  the  background.  Six  vector-valued  image  in- 
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Figure  9.10:  Statistical  distribution  of  image  intensity  within  the  core  measured  at  green  chan¬ 
nel:  (a)  method  1,  (b)  method  2,  (c)  proposed  method 


Figure  9.11:  A  complicated  synthetic  RGB  image 


tensities  are  measured  from  the  tank,  and  assigned  for  the  tank.  After  applying  the  proposed 
method  on  the  same  image,  we  successfully  extract  the  almost  perfect  shape  of  the  tank  from 
the  background  as  shown  in  figure  9.15.  Figure  9.15(a)  shows  the  final  stage  of  contour  evolu¬ 
tion,  and  figure  9.15(b)  shows  the  corresponding  segmentation  result.  In  figure  9.15(b),  we  can 
see  very  sophisticated  detail  of  the  tank.  This  experiment  shows  the  robustness  of  the  proposed 
method  in  ATR  problems. 

Figure  9.16  shows  an  example  of  outdoor  gray  scale  images  with  texture.  There  are  two 
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(a)  (b) 

Figure  9.12:  Method  1  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  9.13:  Method  2  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  9.14:  Training  stage  of  the  proposed  method:  (a)  3  samples  measured  for  background, 
(b)  6  samples  measured  for  the  tank 


zebras  standing  close  on  the  field.  The  zebras  consist  of  white  and  black  stripes,  while  the 
background  is  textured  gray.  The  object  of  this  experiment  is  to  extract  the  shape  of  zebras 
from  the  background.  This  is  a  difficult  segmentation  problem  because  of  the  complicated 
patterns  of  zebras.  The  white  stripes  of  zebras  are  brighter  than  the  background,  while  the 
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(a)  (b) 

Figure  9.15:  Proposed  method  applied  to  a  complicated  synthetic  RGB  image:  (a)  the  final 
stage  of  contour  evolution,  (b)  the  segmented  subsets 


Figure  9.16:  A  complicated  outdoor  gray  image;  two  zebras 


black  stripes  of  zebras  are  darker  than  the  background.  Also,  some  portions  of  zebras,  e.g.  white 
stripes  under  the  shade,  have  similar  intensity  to  the  background.  The  statistical  distribution  of 
image  intensity  within  the  zebras  must  be  represented  as  a  mixture  of  more  than  two  sub-classes. 

Figure  9.17  shows  the  result  of  method  1  applied  on  figure  9.16.  Figure  9.17(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  9.17(b)  shows  the  corresponding  segmentation  re¬ 
sult.  Although  method  1  identifies  the  whole  background  as  a  subset,  it  separates  the  zebras 
into  two  parts:  bright  sub-regions  and  dark  sub-regions.  Also,  method  1  identifies  the  bright 
sub-regions  of  zebras  as  the  background. 

Figure  9.18  shows  the  result  of  method  2  applied  on  figure  9.16.  Figure  9.18(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  9.18(b)  shows  the  corresponding  segmentation  re¬ 
sult.  Although  method  2  produces  better  segmentation  result  than  method  1,  there  are  many 
artifacts  on  both  zebras  and  background.  The  sub-regions  of  zebras  whose  intensity  is  similar 
to  the  background  are  identified  as  background,  the  white  subset,  while  the  sub-regions  of  back¬ 
ground  whose  intensity  is  darker  than  the  other  parts  of  background  are  identified  as  zebras, 
the  black  subset. 
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(a)  (b) 

Figure  9.17:  Method  1  applied  to  the  zebra  image:  (a)  the  final  stage  of  contour  evolution,  (b) 
the  segmented  subsets 


(a)  (b) 

Figure  9.18:  Method  2  applied  to  the  zebra  image:  (a)  the  final  stage  of  contour  evolution,  (b) 
the  segmented  subsets 


Figure  9.19  shows  the  training  stage  of  the  proposed  segmentation  method.  We  used  the 
same  image  as  the  training  samples  for  this  experiment.  The  image  intensities  of  two  sam¬ 
ples  are  measured  and  assigned  as  {A15A2}  f°r  the  background.  Four  image  intensities  from 
bright  white  stripes,  dark  white  stripes,  bright  black  stripes,  and  dark  black  stripes  are  mea¬ 
sured  and  assigned  as  {/i1,  /x2,  A35  A4}  f°r  the  zebras.  After  applying  the  proposed  method  on 
the  same  image,  the  proposed  method  successfully  extracts  the  shape  of  two  zebras  from  the 
background  as  shown  in  figure  9.20.  Figure  9.20(a)  shows  the  final  stage  of  contour  evolution, 
and  figure  9.20(b)  shows  the  corresponding  segmentation  result.  Although  there  are  still  a  few 
artifacts  on  the  zebras,  the  number  of  artifacts  has  been  significantly  reduced,  and  there  are  no 
artifacts  in  the  background.  The  artifacts  found  within  zebras  appear  because  the  image  inten¬ 
sity  of  those  sub-regions  is  identical  to  the  background.  This  experiment  shows  the  robustness 
and  the  limitation  of  the  proposed  segmentation  method. 

Figure  9.21  shows  an  example  of  indoor  RGB  images  with  a  strong  texture.  There  is  a 
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(a)  (b) 

Figure  9.19:  Training  stage  of  the  proposed  method:  (a)  2  samples  measured  for  background, 
(b)  4  samples  measured  for  zebras 


(a)  (b) 

Figure  9.20:  Proposed  method  applied  to  the  zebra  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


Figure  9.21:  A  complicated  indoor  RGB  image;  hand 
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human  hand  and  a  donut-shaped  object  on  the  texture  background.  The  object  of  this  experi¬ 
ment  is  to  extract  the  hand  and  the  donut-shaped  object  from  the  background  and  to  identify 
them  as  different  subsets.  Thus,  the  ideal  segmentation  result  should  have  three  subsets  as 
{^hand^object ^background}-  This  is  a  difficult  segmentation  problem  because  it  requires  not 
only  segmentation  but  also  pattern  classification.  The  textured  background  is  another  difficult 
problem.  In  order  to  partition  the  given  image  into  three  subsets,  all  three  active  contour  mod¬ 
els  use  two  level  set  functions  {<fii(x,  y),  y)}.  The  mathematical  description  of  4-phase 

active  contour  is  presented  in  chapter  7. 

Figure  9.22  shows  the  result  of  method  1  applied  on  figure  9.21.  Figure  9.22(a)  shows  the 


(a)  (b) 

Figure  9.22:  Method  1  applied  to  the  hand  image:  (a)  the  final  stage  of  contour  evolution,  (b) 
the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  9.22(b)  shows  the  corresponding  segmentation  result. 
Method  1  produces  a  very  poor  segmentation  result  in  this  case.  Because  of  the  texture,  a  lot  of 
artifacts  appear  on  the  background.  Also,  the  use  of  two  level  set  functions  produces  undesired 
local  minima  resulting  in  a  lot  of  artifacts  on  the  hand. 

Figure  9.23  shows  the  result  of  method  2  applied  on  figure  9.21.  Figure  9.23(a)  shows  the  fi- 


(a)  (b) 

Figure  9.23:  Method  2  applied  to  the  hand  image:  (a)  the  final  stage  of  contour  evolution,  (b) 
the  segmented  subsets 

nal  stage  of  contour  evolution,  and  figure  9.23(b)  shows  the  corresponding  segmentation  result. 
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Although  method  2  produces  a  relatively  cleaner  segmentation  on  the  textured  background, 
it  generates  a  lot  of  artifacts  on  the  background,  which  is  the  local  minima  of  two  level  set 
functions.  Method  2  also  identifies  the  hand  and  the  donut-shaped  object  as  the  same  subset. 
A  smart  choice  of  initial  contours  may  avoid  the  artifacts  caused  by  local  minima  of  two  level 
set  functions,  but  it  requires  prior  knowledge  of  given  image. 

Figure  9.24  shows  the  training  step  of  the  proposed  segmentation  method.  We  used  the  same 


(a)  (b) 

Figure  9.24:  Training  stage  of  the  proposed  method:  (a)  3  samples  measured  for  hand,  (b)  2 
samples  measured  for  the  donut  shaped  object 

image  as  the  training  samples  for  this  experiment.  The  vector- valued  image  intensities  of  three 
samples  are  measured  from  hand  and  assigned  as  {/xl9  /x2,  A3}  f°r  the  hand.  Two  samples  are 
measured  from  the  donut-shaped  object,  and  assigned  as  {/xl5  fi2}  for  the  object.  For  the  back¬ 
ground,  we  did  not  provide  any  information,  but  assumed  the  background  can  be  represented  as 
a  multivariate  Gaussian  density  function.  After  applying  the  proposed  method  on  the  same  im¬ 
age,  the  proposed  method  successfully  extracts  the  shape  of  hand  and  the  donut-shaped  object 
from  the  background,  and  identifies  them  as  three  different  subsets  {Qhand,  ^ object 5  ^ background } 
as  shown  in  figure  9.25.  Although  there  are  still  a  few  artifacts,  identified  as  Q unknown >  the 


(a)  (b) 

Figure  9.25:  Proposed  method  applied  to  the  hand  image:  (a)  the  final  stage  of  contour  evolu¬ 
tion,  (b)  the  segmented  subsets 

proposed  method  shows  the  best  result  compared  to  the  benchmarks.  This  experiment  shows 
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that  the  proposed  method  can  be  used  for  not  only  image  segmentation  but  also  pattern  clas¬ 
sification. 
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Chapter  10 

Supervised  Active  Contour  Model 
using  Multivariate  Gaussian 
Mixture  Density  Functions 


In  this  chapter,  we  propose  a  supervised  active  contour  model  using  the  parametric  density  es¬ 
timation  method  introduced  in  section  6.1.2.  The  proposed  active  contour  model  estimates  the 
statistical  information  of  training  samples  for  each  subset,  and  applies  the  pre-estimated  infor¬ 
mation  into  image  segmentation  without  iterative  density  estimation  during  contour  evolution. 
Section  10.1  discusses  the  supervised  image  segmentation  model,  and  section  10.2  discusses  the 
parametric  density  estimation  method.  Section  10.3  presents  the  numerical  algorithms  of  the 
proposed  method,  and  section  10.4  presents  the  experimental  results  applied  on  synthetic  and 
real  images. 


10.1  Supervised  Image  Segmentation 

The  proposed  segmentation  method  consists  of  two  stages:  training  and  test  stages.  During  the 
training  stage,  an  advanced  EM  method  with  MML  estimates  the  PDF  of  training  samples  as 
a  mixture  of  multivariate  Gaussian  density  functions.  During  the  test  stage,  the  proposed  seg¬ 
mentation  method  assigns  the  pre-estimated  density  functions  as  the  conditional  PDF  of  image 
intensity  for  each  subset,  and  partitions  the  given  image  without  any  density  estimation  process. 

Let  us  assume  that  training  samples  for  each  subset  are  available  and  the  statistical  dis¬ 
tribution  of  training  samples  is  similar  to  the  statistical  distribution  of  image  intensity  within 
each  subset  in  the  image.  Then,  we  can  estimate  the  conditional  PDF  of  image  intensity  from 
the  training  samples,  such  as 

p{I(x,y)\(x,y)  Vi  (10.1) 

where  li  =  {Ii, . . . ,  In, . . . ,  In}  denotes  the  ideally  sampled  training  samples  or  a  statistical 
template  of  the  object,  which  is  supposed  to  be  isolated  as  a  subset  in  the  result  of  the 
segmentation.  p(l\I  G  li)  :  ?R:B  — ►  5?  denotes  a  multivariate  conditional  PDF  of  a  vector-valued 
image  intensity  I  on  the  condition  that  the  multispectral  image  pixel  is  an  element  of  training 
samples  li.  The  detail  on  the  density  estimation  of  p(I|I  G  li)  will  be  discussed  in  the  next 
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section. 


The  proposed  segmentation  method  uses  the  same  global  energy  minimization  scheme  as 
the  other  two  methods  introduced  in  chapter  8  and  9.  However,  the  minimization  of  the  global 
energy  function  E  with  respect  to  the  statistical  representation  pi  in  equation  9.4  and  8.1  of 
each  subset  is  not  necessary  in  the  proposed  method  because  the  best  fitting  pi  is  already  given 
as  p( I|I  Gl*)  from  training  samples  instead  of  the  given  image.  The  global  energy  function  can 
be  simplified  as  a  function  of  variational  contours  C  only,  such  as 

E{C)  =  ^2  (  -  log p(I(x,  y)\I(x,  y)  eli)dxdy  +  is\C\,  (10.2) 

* 

where  the  pre-estimated  p(I(x,y)\I(x,y)  £  T,)  determines  the  segmentation  criteria  and  pro- 
vides  a  force  to  propagate  the  variational  contours  C  to  the  proper  position  where  C  divides 
the  entire  domain  D  of  the  image  I(x,  y )  into  multiple  subsets  This  is  a  supervised  image 

segmentation.  Then,  the  minimization  of  E  with  respect  to  C  smoothes  the  variational  bound¬ 
aries  C  minimizing  the  length  \C\. 


The  Euler-Lagrange  equation  obtained  by  minimizing  the  energy  function  E  with  respect 
to  level  set  functions  =  {</> i, . . . ,  </>j, . . . ,  cf)j}  is  given  by 


d^jpc/y) 

dt 


VK, 


+Elogp(I(:r’  y)\l(x,y)  e  li 


2=1 


dXi 

dHn 


,  Vj, 


(10.3) 


where  Sj  =  5e((j)j(x,y))  controls  the  width  of  update  d(j)j/dt ,  p(I(x,  y)\I(x,  y)  G  li)  provides 
a  force  to  propagate  contours  Cj,  and  dxi/dHj  determines  the  region  where  4>j  updates. 
The  implementation  of  the  proposed  active  contour  model  is  identical  to  the  two  other  pro¬ 
posed  active  contour  models  except  that  the  proposed  method  does  not  need  to  compute 
p(I(x,y)\I(x,y)  G  li)  at  each  iteration. 


The  proposed  active  contour  model  estimates  the  PDF  of  training  samples  prior  to  the  con¬ 
tour  evolution,  and  does  not  estimate  the  PDF  from  the  image  but  only  apply  the  pre-estimated 
PDF  for  image  segmentation.  This  supervised  approach  simplifies  the  segmentation  procedure 
reducing  the  convergence  time  of  active  contours.  As  long  as  the  training  samples  are  reliable, 
the  segmentation  results  will  be  also  more  robust  than  the  results  of  the  two  methods  intro¬ 
duced  in  chapter  8  and  9  as  well  as  traditional  segmentation  methods.  However,  the  results 
will  not  consistent  or  robust  if  the  training  samples  are  unreliable. 


10.2  Multivariate  Gaussian  Mixture  Density  Function  Estimated 
by  EM  with  MML 

In  this  chapter,  we  represent  the  conditional  PDF  of  vector-valued  image  intensity  for  being 
an  element  of  a  subset  p(I(x,y)\(x,y)  G  Qi)  in  the  same  way  that  we  estimate  pi  in  chapter  9, 
i.e.  a  mixture  of  multivariate  Gaussian  density  functions 

K 

p(l(x,y)\(x,y)  eCli)  =  ^2akp(l(x,y)\nk,Ek),  \/i,  (10.4) 

k= 1 
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except  that  the  mixture  density  function  is  estimated  from  training  samples  given  by 

k 

p(l(x,y)\(x,y)  e  Qi)  «  '^2,akp{li\jxk,t,k),  Vi.  (10.5) 

k= l 

The  estimation  above  is  done  by  estimating  four  sets  of  parameters:  the  number  of  sub-classes 
K,  the  weight  {<a&},  the  mean  vector  {//&},  and  the  covariance  matrix  {E^}. 

Since  classic  EM  method  cannot  estimate  the  number  of  sub-classes  K,  the  proposed  method 
uses  an  advanced  EM  method  proposed  by  Figueiredo  and  Jain  [144]  to  estimate  a  mixture  den¬ 
sity  function  from  given  data  samples,  i.e.  training  samples.  The  advanced  EM  method  based 
on  the  minimum  encoding  length  criteria  initiates  a  high  number  of  sub-classes,  and  eliminates 
the  weakest  sub-class  one  by  one,  so  it  can  automatically  select  the  number  of  sub-classes  K. 
Therefore,  all  four  sets  of  parameters,  K ,  {d&},  {/i*.},  {E&},  can  be  estimated  from  given  data 
samples  without  any  prior  knowledge  on  the  data.  That  is  an  unsupervised  density  estima¬ 
tion  method  estimates  parameters,  and  a  supervised  segmentation  method  partitions  an  image 
based  on  the  estimated  parameters.  The  detail  of  mathematic  description  of  this  density  esti¬ 
mation  method  is  presented  in  section  6.1.2,  and  the  numerical  algorithm  is  presented  in  the 
next  section. 


10.3  Algorithm 


Table  10.1:  The  input  and  output  variables  used  in  the  supervised  active  contour  model  using 
multivariate  Gaussian  mixture  densities _ 

•  Input 

—  I (x,y)  :  ft  — ►  Jft5,  a  multispectral  image. 

—  J,  the  number  of  level  set  functions  {(f)  j(x,  y)}  =  the  number  of  active 
contours  {Cj}. 

—  v,  a  parameter  to  control  the  regularity  of  contours. 

—  At,  the  time  step  of  iterative  processing. 

—  {Ii,J2,.  •  •  ,Z2j},  training  samples  for  each  subset. 

•  Output 

—  S(x,  y)  :  ft  — >  {1, . . . ,  2J},  the  segmented  image 


Table  10.1  shows  the  input  and  output  variables  used  in  the  proposed  segmentation  method. 
A  multispectral  image  I (x,y)  is  given  as  the  input  data.  The  number  of  level  set  functions, 
which  is  equivalent  to  the  number  of  contours,  is  set  as  J.  v  controls  the  regularity  of  con¬ 
tour  evolution  by  controlling  the  relative  balance  between  segmentation  force  and  regularity 
force.  At  determines  the  convergence  speed.  The  parameters  representing  each  subset  are 
estimated  from  the  corresponding  training  samples  Zj.  After  the  proposed  active  contours  con¬ 
verge,  the  algorithm  produces  the  segmented  image  S(x,  y )  indicating  each  subset  with  different 
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numbers. 


The  proposed  active  contour  algorithm  consists  of  three  stages:  training,  initialization,  and 
contour  evolution  (test).  During  the  training  stage,  the  four  sets  of  parameters  representing  the 
PDF  of  training  samples  are  estimated  for  each  subset  Qf.  the  mean  vector  the  weight 

{d&},  and  the  covariance  matrix  {£&}  of  each  sub-class,  and  the  number  K  of  sub-classes.  The 
mathematical  description  of  density  estimation  method  EMwMML(-)  is  presented  in  section  6.1.2, 
and  the  algorithm  is  presented  in  algorithm  6. 


Algorithm  5:  Supervised  Active  Contour  Model  using  Multivariate  Gaussian  Mixture 
Density  Functions 

Training 

for  i  —  1  •  •  •  2 J  do 

L  [K,  {Uk},  {A  J,  {Efc}]*  =  EMwMML(Xj) 

Initialization 
t  —  0. 

for  j  =  1  •  •  •  J  do 

Cj{x(s),y(s),t  —  0):  Initialize  contours. 

4>j(x,y,t  =  0)  =  ±D((x,y),N(Xfy)(Cj)):  Initialize  level  set  functions. 

Contour  evolution 
repeat 

for  j  =  1  •  •  •  J  do 

Kj(x,y)  =  meanCurvature(0j(x,  y)) 

L  Hj(x,y)  =  unitStep  0>j{x,y)) 

Supervised  region-based  segmentation 
for  i  —  1  •  •  •  2 J  do 

Xi(x,U )  =  identityFunction ({Hj(x,y),  Vj}) 

L  ei(x>  y)  =  -  logp(I(x,  y) \ [K,  {ak},  {fik},  {£fc}] *) 

Update  level  set  function 
for  j  =  1  •  •  •  J  do 

< Pj{x,y )  =  <pj(x,y)  +  Ai5(<^(a;,y)) 

t  =  t  +  At 

until  until  the  evolution  of  contour  converges 

Segmentation  result 

2J 

S(x,y )  =  X  Xi(x,y) 

1=1 

During  the  initialization  stage,  J  sets  of  initial  contours  {Cj}  are  given  either  manually  or 
from  a  defined  set,  and  corresponding  level  set  functions  are  initialized  as  the  signed  distance 
from  each  (x,y)  to  the  closest  contours.  During  the  iterative  contour  evolution,  the  mean  cur¬ 
vature  Kj(x,y)  and  the  unit  step  function  Hj(x,y)  are  computed  from  each  level  set  function 
<f>j{x,y).  The  mathematical  description  of  function  meanCurvature(-)  and  unitStep(-)  are  in¬ 
troduced  respectively  in  equation  3.10  and  7.3.  Then,  binary  identity  functions  Xi(xiV)  f°r 
each  subset  are  computed.  The  mathematical  description  of  function  identityFunction(-) 


2J  a 

VK(<t>j{x,y))  -  X  ei{x>y)dH- 

i= 1  3 
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is  given  in  equation  7.4.  There  is  no  density  estimation  procedure  during  the  contour  evo¬ 
lution  as  the  required  parameters  are  already  estimated  from  training  samples  Zj.  Instead, 
the  proposed  method  just  assigns  the  pre-estimated  parametric  mixture  density  functions 
p(J.(x,y)\[K,  {d&},  {/i*.},  each  subset  Qi  on  the  image,  and  the  density  functions  per¬ 
form  the  role  of  conditional  PDF  of  image  intensity.  p(I(x,y)\[K,  {d&},  updates 

the  objective  function  ei{x,y)  for  consequently  updates  the  level  set  functions  {(/)j(x,y)} 
based  on  Xi(x,y)  and  ei{xiV)- 


Table  10.2:  The  input  and  output  variables  used  in  the  estimation  of  a  mixture  of  multivariate 
Gaussian  density  functions  using  an  advanced  EM  method  with  MML 

•  Input 

—  1=  {1(1),  1(2), . . . ,  I (n), . . . ,  I(JV)},  I  €  MB,  data  samples. 

—  Kmax :  the  maximum  possible  number  of  sub-classes. 

—  Kmin :  the  minimum  possible  number  of  sub-classes. 

•  Output 

—  K\  the  estimated  number  of  sub-classes. 

—  {fak}'-  the  estimated  mean  vector  of  each  sub-class. 

—  {d^}:  the  estimated  weight  of  each  sub-class. 

—  {E&}:  the  estimated  covariance  matrix  of  each  sub-class. 


Table  10.2  shows  the  input  and  output  variables  used  in  the  function  EMwMML(-)  [144].  The 
advanced  unsupervised  EM  method  takes  data  samples  I  and  the  maximum  and  minimum 
possible  number  of  sub-classes  {Kmax,  Kmin}  as  input,  and  produces  the  four  sets  of  parame¬ 
ters  representing  a  mixture  of  multivariate  Gaussian  density  functions:  the  mean  vectors 
the  weights  {d&},  and  the  covariance  matrix  {E&}  of  each  sub-class,  and  the  number  K  of 
sub-classes. 

The  advanced  EM  method  used  in  the  proposed  active  contour  model  estimates  the  output 
values  through  an  iterative  process  consisting  of  four  steps:  expectation  (E-step),  maximization 
(M-step),  evaluation,  and  annihilation.  At  the  initialization  stage,  the  number  of  sub-classes 
is  initialized  as  Kmax.  The  three  other  parameters,  {d&},  {A&},  {E&},  are  also  initialized  as 
appropriate  values.  During  E-step,  the  expected  posterior  probabilities  of  the  given  data  sam¬ 
ples  based  on  the  current  status  of  parameters  [d^, /xfc,  E&]  are  estimated  and  updated  for 
each  sub-class  k.  During  M-step,  four  output  parameters,  [AT,  {d&},  {E&}],  are  estimated. 

Function  eliminate (•)  eliminates  the  sub-class  with  zero  weight  =  0,  adjusts  the  number 
of  sub-classes  K  —  K  —  1,  and  rearranges  the  parameters  of  remained  sub-classes.  During  the 
evaluation  step,  we  check  the  value  of  metric  function  £,  and  stop  the  iteration  if  C(t)  has 
converged.  During  the  annihilation  step,  we  eliminate  the  sub-class  with  the  lowest  weight. 
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Algorithm  6:  Advanced  EM  Algorithm  with  MML 


Initialization 

K  —  K 

1V  —  lvmax 

for  k  =  1  •  •  •  K  do 

Initialize  d&,  £ik,  Y>k. 

for  n  =  1  •  •  •  N  do  uk{n)  =  p(I(n)|/ife,  E*.) 

^(0)  —  OO,  C-'min  ~ 

Estimation 

while  K  >  Kmin  do 
while  A C  >  e  do 

k  —  1 

while  k  <  K  do 

E-step:  update  posterior  probabilities. 

for  n  =  1  •  •  •  N  do  wk(n)  =  — ^kuk(n) — 


M-step:  update  parameter  set  ©  =  [K,  {d&},  {E&}] . 

-  _  max(Q,En=i  Wk(n)-N/ 2) 

k  Ef=i(max(0,E^=1  ^  (n)-N/2)} 

{&k}  —  A/g/  &i •,  v/c 

if  d&  =  0  then 

I  [K,  { uk },  {d*J,  {Afe}5  =  eliminate  (A,  {uk},  {dfe},  {ftk}, 

else 

for  k  =  1  •  •  •  K  do 

-  _  e!Li  iw^fcW 
/iA  “  E^Wn) 

V  _  EiLi  ulfc(n)(i(w)— Afc)(i(?t)— Afc)T 
EiLi  «»*(») 

L  for  n  =  1  •  •  -N  do  uk{n)  =  p(I(n)\0k)  =  p(I(n)\jj,k,  tk) 
k  =  k  +  1 

Evaluation:  check  the  criterion  to  stop  the  iteration. 


im) 


m  =  f  ELi  log  ^ + 1  log  f2  +  -  e:=i  log  (Eti 

A£  =  \{£(t)-£(t-l)}/£(t-l)\ 

if  C-'miri  ^  then  C^rnin  ~  ©fre5£  — 

Annihilation:  eliminate  the  weakest  sub-class. 

if  K  ^  Amin  then 

d&*  —  0  where  £;*  —  arg  min  ak 

k 

Oik!  Ej=i  dj  5  Vfc 

d,  {«*}>  {£*}>  {£*}]  =  eliminated,  {«*,},  {afc},  {£&},  {£fc}) 


10.4  Experiments 

In  this  section,  we  apply  the  proposed  segmentation  method  to  three  different  images,  and  com¬ 
pare  the  results  of  the  proposed  method  to  the  results  of  other  active  contour  models.  The  same 
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region-based  active  contour  models  used  in  the  last  two  chapters  were  used  as  benchmarks.  All 
three  methods  including  the  proposed  method  preform  the  segmentation  based  on  the  statis¬ 
tical  property  of  image  intensity  measured  within  a  subset.  Method  1  and  2  are  unsupervised 
methods,  while  the  proposed  method  determines  the  statistical  information  from  training  sam¬ 
ples.  The  experiments  of  all  three  methods  use  the  same  pre-defined  initial  contours.  The 
detail  of  mathematical  and  analytical  descriptions  of  both  benchmarks  methods  are  presented 
in  section  3.4. 

Figure  10.1  shows  an  example  of  synthetic  gray  images  with  textures.  There  is  a  small 


Figure  10.1:  Synthetic  textured  image:  (a)  a  textured  gray  image,  (b)  the  ground  truth  image 

rectangle  in  figure  10.1(a),  and  both  the  rectangle  and  the  background  have  their  own  texture 
pattern.  The  object  of  this  experiment  is  to  separate  the  rectangle  from  the  background.  Fig¬ 
ure  10.1(b)  shows  the  ground  truth  of  the  given  image.  The  dark  part  presents  the  background, 
class  1,  while  the  bright  part  presents  the  rectangle,  class  2. 


Figure  10.2  shows  the  result  of  method  1  applied  on  figure  10.1.  Figure  10.2(a)  shows  the 


tplwi 


(a)  (b) 

Figure  10.2:  Method  1  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  10.2(b)  shows  the  corresponding  segmentation  result. 
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Method  1  produces  extremely  many  artifacts  on  both  the  background  and  the  rectangle  because 
of  the  complicated  texture  patterns.  Method  1  partitions  the  image  into  dark  sub-regions  and 
bright  sub-regions,  but  they  are  not  equivalent  to  the  two  texture  patterns. 

Figure  10.3  shows  the  result  of  method  2  applied  on  figure  10.1.  Figure  10.3(a)  shows  the 


(a)  (b) 

Figure  10.3:  Method  2  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 

final  stage  of  contour  evolution,  and  figure  10.3(b)  shows  the  corresponding  segmentation  result. 
Method  2  does  not  show  many  artifacts  on  the  background,  but  cannot  extract  the  shape  of 
the  rectangle.  Figure  10.2  and  10.3  show  that  the  performance  of  unsupervised  segmentation 
methods  is  limited  if  the  image  has  non-uniform  subsets  and  different  subsets  show  similar 
image  intensities. 

We  applied  the  proposed  supervised  segmentation  method  on  the  same  image.  Figure  9.7 
shows  the  training  samples  used  for  the  proposed  method.  Two  reference  images  of  the  tex- 


(a)  (b) 


Figure  10.4:  Training  stage  of  the  proposed  method:  (a)  reference  image  for  background,  (b) 
reference  image  for  the  rectangle 

ture  patterns  are  given  as  training  samples.  Figure  10.4(a)  shows  the  training  samples  for  the 
background,  and  figure  10.4(b)  shows  the  training  samples  for  the  rectangle.  The  proposed 
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method  estimates  the  PDF  of  both  images  as  a  mixture  of  Gaussian  density  functions,  and  as¬ 
sign  them  as  the  conditional  PDF  of  image  intensity  for  each  subset.  Figure  10.5  shows  the  the 
result  of  the  proposed  method.  Figure  10.5(a)  shows  the  final  stage  of  contour  evolution,  and 


(a)  (b) 

Figure  10.5:  Proposed  method  applied  to  a  synthetic  textured  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 

figure  10.5(b)  shows  the  corresponding  segmentation  result.  The  proposed  method  successfully 
separates  the  rectangle  from  the  background  without  any  significant  artifact. 

For  more  precise  comparison,  the  statistical  distribution  of  image  intensity  within  both  the 
background  and  the  rectangle  are  measured.  Figure  10.6  shows  the  statistical  distribution  of 
image  intensity  within  class  1,  which  is  supposed  to  be  the  background  in  the  segmentation  re¬ 
sult.  Figure  10.6(a),  10.6(b),  and  10.6(c)  respectively  present  the  results  of  method  1,  method 
2,  and  the  proposed  method.  The  blue  solid  lines  of  all  three  graphs  present  the  statistical 
distribution  of  image  intensity  measured  at  the  background,  which  is  a  mixture  of  at  least  three 
sub-classes.  The  green  solid  lines  of  all  three  graphs  present  the  same  statistics  measured  within 
class  1  in  the  segmentation  results.  As  the  green  solid  line  exists  closer  to  the  blue  solid  line,  it 
presents  better  result.  In  figure  10.6(a),  the  dotted  red  line  with  a  circle  presents  the  represen¬ 
tation  of  the  subset  used  in  method  1,  i.e.  the  mean  of  green  solid  line.  The  measured  statistics 
is  very  different  from  the  ground  truth  because  the  constant  representation  is  not  sufficient 
to  represent  the  non-uniform  sub-region  consisting  of  many  sub-classes.  In  figure  10.6(b),  the 
dotted  red  line  presents  the  representation  of  the  subset  used  in  method  2,  i.e.  the  Gaussian 
density  function  of  image  intensity  measured.  The  measured  statistics  is  similar  to  the  ground 
truth  though  the  representation  is  quite  different  from  the  true  statistics.  In  figure  10.6(c), 
the  dotted  red  line  presents  the  estimated  parametric  mixture  density  function  from  training 
samples.  The  estimated  density  function  is  very  similar  to  the  true  statistics,  and  the  measured 
statistics  is  almost  identical  to  the  ground  truth.  Figure  9.10  shows  the  same  measure  for  class 
2,  the  rectangle.  Although  the  ground  truth  looks  like  a  simple  Gaussian  distribution,  neither 
method  1  or  2  can  produce  reasonable  results,  as  shown  in  figure  10.7(a)  and  10.7(b),  because 
the  non-uniform  statistics  in  the  other  subset  was  too  complicated  for  them.  In  figure  10.7(c). 
the  proposed  method  estimates  very  accurate  statistics  from  training  samples,  and  it  leads  to 
a  very  successful  segmentation  result. 

Figure  10.8  shows  an  example  of  outdoor  gray  images  with  texture.  There  is  a  zebra  stand- 
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(c) 

Figure  10.6:  Statistical  distribution  of  image  intensity  within  background:  (a)  method  1,  (b) 
method  2,  (c)  proposed  method 


ing  on  the  field.  The  zebra  consists  of  white  and  black  stripes,  while  the  background  is  uniformly 
gray.  The  object  of  this  experiment  is  to  extract  the  shape  of  zebra  from  the  background.  This 
is  a  difficult  segmentation  problem  because  of  the  complicated  patterns  of  the  zebra  and  the 
shade  under  the  zebra.  The  white  stripes  of  zebra  are  brighter  than  the  background,  while  the 
black  stripes  of  zebra  are  darker  than  the  background.  Also,  some  portions  of  zebras,  e.g.  white 
stripes  under  the  shade,  have  similar  intensity  to  the  background.  The  shade  under  the  zebra 
is  as  dark  as  the  black  stripes  though  it  is  not  part  of  the  zebra.  The  statistical  distribution  of 
image  intensity  within  the  zebra  must  be  represented  as  a  mixture  of  more  than  at  least  two 
sub-classes. 

Figure  10.9  shows  the  result  of  method  1  applied  on  figure  10.8.  Figure  10.9(a)  shows  the 
final  stage  of  contour  evolution,  and  figure  10.9(b)  shows  the  corresponding  segmentation  result. 
Method  1  separates  the  zebra  into  two  parts,  i.e.  bright  sub-regions  and  dark  sub-regions,  but 
identifies  the  bright  sub-regions  of  the  zebra  as  the  background. 

Figure  10.10  shows  the  result  of  method  2  applied  on  figure  10.8.  Figure  10.10(a)  shows 
the  final  stage  of  contour  evolution,  and  figure  10.10(b)  shows  the  corresponding  segmenta¬ 
tion  result.  Although  method  2  produces  better  segmentation  result  than  method  1,  there  are 
still  many  artifacts  on  both  zebras  and  background.  The  sub-regions  of  zebras  whose  inten¬ 
sity  is  similar  to  the  background  are  identified  as  background,  i.e.  the  white  subset,  while  the 
sub-regions  of  background  whose  intensity  is  darker  than  the  other  parts  of  background  are 
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Figure  10.7:  Statistical  distribution  of  image  intensity  within  the  small  rectangle:  (a)  method 
1,  (b)  method  2,  (c)  proposed  method 


identified  as  zebras,  i.e.  the  black  subset. 

Figure  10.11  shows  the  training  stage  of  the  proposed  segmentation  method.  We  used  the 
same  image  as  the  training  samples  for  this  experiment.  The  yellow  rectangles  indicate  the 
regions  used  as  training  samples.  The  statistical  distribution  of  image  intensity  within  those 
sampled  regions  {p(I\I  G  Ti)}  is  estimated  as  a  mixture  of  Gaussian  density  functions,  and 
assigned  as  the  conditional  PDF  of  image  intensity  as  the  background  p(I(x,y)\(x,y)  G  Q±) 
and  the  zebra  p(I(x,  y)\(x,  y)  G  Figure  10.12(a)  shows  the  final  stage  of  contour  evolution, 
and  figure  10.12(b)  shows  the  corresponding  segmentation  result.  Although  there  are  still  a  few 
artifacts  on  the  zebra,  the  number  has  been  significantly  reduced.  There  are  no  artifacts  in  the 
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(a) 


(b) 


Figure  10.9:  Method  1  applied  to  a  complicated  outdoor  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  10.10:  Method  2  applied  to  a  complicated  outdoor  image:  (a)  the  final  stage  of  contour 
evolution,  (b)  the  segmented  subsets 


(a)  (b) 

Figure  10.11:  Training  stage  of  the  proposed  method:  (a)  training  samples  for  background,  (b) 
training  samples  for  the  zebra 


background.  The  artifacts  found  within  the  zebra  appear  because  the  image  intensity  of  those 
sub-regions  is  identical  to  the  background.  This  experiment  shows  how  robust  the  proposed 
method  using  mixture  density  functions  is  compared  to  traditional  methods  using  unimodal 
density  functions,  and  it  also  shows  the  limitation  of  the  image  segmentation  method  based  on 
image  intensity  as  it  cannot  separate  two  objects  with  identical  image  intensities. 
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(a)  (b) 

Figure  10.12:  Proposed  method  applied  to  a  complicated  outdoor  image:  (a)  the  final  stage  of 
contour  evolution,  (b)  the  segmented  subsets 


Figure  10.13  shows  an  example  of  applications  using  the  proposed  method  on  indoor  RGB 
images.  The  object  of  this  experiment  is  to  simulate  ATR  problems  with  real  RGB  images  in- 


(b)  (c) 

Figure  10.13:  Complicated  indoor  RGB  images:  (a)  reference  image,  (b)  test  1,  (c)  test  2 

stead  of  synthetic  images.  We  measure  training  samples  from  a  reference  image,  and  apply  the 
proposed  method  with  the  training  samples  into  other  images  taken  under  different  light  condi¬ 
tion  and  angle.  We  applied  them  into  the  images  with  even  unknown  objects.  Figure  10.13(a) 
shows  the  reference  image.  There  is  a  toy  tank  on  the  carpet,  and  the  carpet  has  a  wool  pat¬ 
tern.  This  image  simulates  a  miniature  version  of  a  tank  on  the  grass  field.  Figure  10.13(b) 
and  10.13(c)  show  the  two  test  images.  The  reference  image  figure  10.13(a)  is  taken  from  di¬ 
rectly  above  the  target  simulating  a  view  from  satellite  or  high- altitude  reconnaissance  flight, 
while  the  first  test  image  figure  10.13(b)  is  taken  from  a  diagonal  view  and  the  focus  is  little  bit 
blurred  simulating  a  view  from  a  missile  or  a  jet  fighter.  The  second  test  image  figure  10.13(c) 
shows  another  target,  a  humvee,  as  well  as  the  tank  shown  in  the  reference  image.  There  is  also 
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a  gray  painted  vehicle  in  the  figure  10.13(c),  which  simulates  a  non-target  object.  This  is  the 
most  difficult  problem  among  the  experiments  shown  in  this  document  because  the  reference 
image  and  test  images  were  taken  from  different  views,  and  we  do  not  have  information  for  all 
objects  appeared  in  the  test  images,  i.e.  the  humvee  and  the  gray  painted  vehicle. 

Figure  10.11  shows  the  training  stage  of  the  proposed  segmentation  method.  We  obtain 


(a)  (b) 

Figure  10.14:  Training  stage  of  the  proposed  method:  (a)  training  samples  for  background,  (b) 
training  samples  for  the  tank 

some  training  samples  from  the  reference  image.  The  yellow  rectangles  indicate  the  regions 
used  as  training  samples.  The  statistical  distribution  of  image  intensity  within  those  sampled 
regions  {p(I|I  G  li)}  is  estimated  as  a  mixture  of  multivariate  Gaussian  density  functions,  and 
assigned  as  the  conditional  PDF  of  image  intensity  as  the  background  p(I(x,  y)  |(x,  y)  G  Q\)  and 
the  target  p(I(x,y)\(x,y)  G  D2). 

Figure  10.15  shows  the  result  of  the  proposed  method  applied  on  figure  10.13(b).  Figure  10.15(a) 


(a)  (b) 

Figure  10.15:  Proposed  method  applied  to  test  1  image:  (a)  the  final  stage  of  contour  evolution, 
(b)  the  segmented  subsets 

shows  the  final  stage  of  contour  evolution,  and  figure  10.16(b)  shows  the  corresponding  seg¬ 
mentation  result.  The  object  of  this  experiment  is  to  extract  the  shape  of  the  tank  from  the 
background,  and  the  proposed  method  produces  a  reasonable  result.  Although  there  are  a  few 
small  artifacts  on  the  result,  a  smarter  choice  of  training  samples  may  solve  the  problem. 
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Figure  10.16  shows  the  result  of  the  proposed  method  applied  on  figure  10.13(c).  Figure  10.16(a) 


(a)  (b) 

Figure  10.16:  Proposed  method  applied  to  test  2  image:  (a)  the  final  stage  of  contour  evolution, 
(b)  the  segmented  subsets 

shows  the  final  stage  of  contour  evolution,  and  figure  10.16(b)  shows  the  corresponding  segmen¬ 
tation  result.  The  object  of  this  experiment  is  to  extract  the  shape  of  the  tank  and  the  humvee, 
but  to  ignore  the  non-target  object,  i.e.  the  gray  painted  car.  There  was  no  such  information 
on  the  non-target  object  in  the  reference  image.  The  proposed  method  successfully  extracts  the 
shape  of  the  tank  and  the  humvee,  while  it  ignores  the  non-target  object.  There  is  an  artifact 
on  the  hood  of  the  humvee  because  the  white  star  did  not  appear  in  the  reference  image.  The 
segmentation  result  is  little  bit  affected  by  shading  and  different  light  condition.  The  shadows 
of  the  gray  painted  vehicle  are  recognized  as  targets,  and  a  bright  portion  of  the  tank  is  recog¬ 
nized  as  the  background.  This  experiment  shows  the  potential  use  of  the  proposed  method  in 
real  ATR  problems  and  the  possible  problems  that  should  be  improved  by  future  research. 
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Chapter  11 

Improvements  in  Unsupervised 
Active  Contour  Model  using  an 
Expanded  Feature  Space 


In  this  chapter,  we  propose  some  improvements  over  the  unsupervised  image  segmentation 
method  using  Multi-dimensional  Histograms  introduced  in  chapter  8.  The  feature  space  is 
expanded  to  include  3  more  features  apart  from  image  intensity.  These  are  then  non-linearly 
diffused  before  using  them  to  measure  a  PDF  expression  for  each  subset  in  the  image  at  each 
iteration  and  update  the  level  set  function.  Section  11.1  discusses  feature  extraction  procedure 
to  compute  the  new  expanded  set  of  features.  Section  11.2  discusses  the  PDF  modeling  of 
the  new  feature  set  for  improved  unsupervised  image  segmentation.  Section  11.3  shows  the 
numerical  algorithm  and  section  11.4  shows  results  on  some  real  images. 


11.1  Feature  Extraction 


The  proposed  improvement  is  based  on  expanding  the  feature  set  to  include  features  from  the 
modified  classical  structure  tensor  as  proposed  in  Rousson  and  Derichel  [164].  The  classical 
structure  tensor  is  defined  below: 


Jp  =  Kp*  (V/VIT)  = 


KP*I2X 

K p  *  Ixly 


Kp  *  IxIy 

K,  .  Jj 


(11.1) 


where  Kp  is  a  Gaussian  kernel  with  standard  deviation  p  and  the  subscripts  denote  partial 
derivatives  of  the  image.  We  replace  the  convolution  with  the  Gaussian  kernel  with  non-linear 
diffusion  as  proposed  in  [164].  Using  a  Gaussian  kernel  results  in  the  smoothing  of  the  edges 
and  thus,  leads  to  inaccurate  segmentation  near  the  boundaries.  Non-linear  diffusion  avoids 
the  problem  by  reducing  the  smoothening  near  the  edges. 

Thus,  our  feature  space  is  given  by  the  non-linearly  diffused  version  of: 

U  =[  /  42  I2y  4  Iyf  (11.2) 

Note  that  u  is  a  vector  valued  image. 

We  use  the  vector  valued  non-linear  diffusion  scheme  as  proposed  in  [164]. 


N 

dtUi  =  div(g(^2  |Vufc| 2)Viti)  Vi 

k= 1 


(11.3) 
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All  the  channels  are  coupled  in  the  diffusivity  function  g(s).  Therefore,  an  edge  in  one 
channel  inhibits  smoothing  in  the  others.  The  diffusivity  function  g(s)  is  defined  as: 

»(») = V  <n'4) 

s  +  e 

where  e  is  a  small  constant  to  avoid  division  by  zero. 


11.2  Probability  Density  Estimation 

The  probability  density  function  for  the  contour  evolution  is  estimated  using  the  expanded 
feature  set  rather  than  just  the  multi-dimensional  histogram  of  the  image  intensity.  The  four 
feature  vectors  are  assumed  to  be  independent  of  each  other.  The  feature  vector  /  is  estimated 
using  a  multi-dimensional  histogram  as  proposed  in  chapter  8. 

The  other  3  feature  vectors  are  approximated  as  a  multidimensional-Gaussian  distribution 
with  parameters  {/i^,  for  each  subset  of  the  image.  These  are  updated  at  every  iteration 
during  the  evolution  of  the  contour  as  follows: 


IH  =  fu(x)Xidx/jXidx 


Ei  =  /  (“(*)  -  w)  («(*)  -  mVxidx  / jx.dx 


(11.5) 


where  Xi  is  the  binary  identity  function  defined  in  chapter  7. 

Note  that  the  Covariance  matrix  (X^)  is  diagonal  because  of  the  independence  of  each  feature 
channel. 

The  total  probability  density  function  for  each  subset  of  the  image(f^)  is  then  estimated  as 
follows: 


Pi{u(x))  =  Y[pi{uk{x )) 


(11.6) 


The  proposed  method  uses  the  same  global  energy  minimization  scheme  as  proposed  in 
chapter  8.  The  global  energy  function  is  given  by 


E({Pi},C )  =  Yl  [  e(x,y\pi)dxdy  +  u\C\,  (11.7) 

*  J 

and  the  global  energy  E  is  minimized  with  respect  to  two  expressions:  {pi},  i.e.  the  statistical 
representation  of  the  image  within  a  subset  and  the  variational  contours  C.  First,  the 
minimum  E  with  respect  to  {pi}  is  achieved  while  C  is  fixed.  Then,  the  minimum  E  with 
respect  to  C  is  achieved  while  {pi}  are  fixed.  An  objective  function 

e(x,y\pi)  =  —  log  pi,  V(x,s/)6n,  Vi,  (11.8) 

determines  the  region-based  segmentation  criteria  .  The  probability  density  function  pi  for  each 
subset (Qi)  is  estimated  by  using  equation  11.6. 
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11.3  Algorithm 


The  contour  evolution  algorithm  is  similar  to  the  one  proposed  in  chapter  8.  There  are  however, 
some  differences.  One  is  the  feature  extraction  step  and  the  other  is  in  the  computation  of  the 
objective  function.  Table  11.1  shows  the  input  and  output  variables  used  in  the  algorithm  and 
the  modified  algorithm  is  shown  in  algorithm  7. 

Table  11.1:  The  input  and  output  variables  used  in  the  unsupervised  image  segmentation  with 
the  expanded  feature  space 

•  Input 

—  I (x,y)  :  Q  — ►  SR1,  a  greyscale  image. 

—  J,  the  number  of  level  set  functions  {</>j(x,  y)}  =  the  number  of  active 
contours  {Cj}. 

—  z/,  a  parameter  to  control  the  regularity  of  contours. 

—  At,  the  time  step  of  iterative  processing. 

•  Output 

—  S{x ,  y)  :  ft  — >  {1, . . . ,  2J},  the  segmented  image 


Because  of  the  expanded  feature,  the  probability  density  is  estimated  by  a  Multi-dimensional 
Histogram  for  feature  u\  which  is  the  diffused  version  of  the  image  intensity (/).  The  other 
features  approximated  by  Gaussian  distribution. 


95 


Algorithm  7:  Unsupervised  Active  Contour  Model  using  and  Expanded  Feature  Space 


Initialization 
t  —  0. 

u  =  FeatureExtract(I(x,  y)) 
for  j  =  1  •  •  •  J  do 

Cj{x(s),y(s),t  —  0):  Initialize  contours. 

(f)j{x,y,t  =  0)  =  ±D((x,  y),  ^(Xiy)(Cj))-  Initialize  level  set  functions. 

Contour  evolution 
repeat 

for  j  =  1  •  •  •  J  do 

Kj(x,y)  =  meanCurvature((/)j(x,  y)) 

L  Hj(x,y)  =  unitStep(0j(a:,  y)) 

Unsupervised  region-based  segmentation 
for  i  —  1  •  •  •  2J  do 

Xi{x,y)  =  identityFunction({/fj(a;, y),  Vf}) 

Pi  =  histogramDensity(«i(a;,  y)Xi(x,  y)) 
for  k  =  2  •  •  •  4  do 

Pi,k  =  mean  (uk,Xi(x,y)) 

=  Variance(itfe,Xi(»,y)) 

Pi=Pi  *  GaussianEstimate(«fc(a;,  y),  y, ,  Sj) 

L  ei(x,y)  =  -log  pi 


Update  level  set 

function 

for  j  =  1  •  •  •  J  do 

i 

II 

_ i 

c,y)  +  AtS(<f>j(a 

2J 

v,y))  vK((f>j(x,y))  -  X  e»l 
2=1 

I  t  =  t  +  At 

until  until  the  evolution  of  contour  converges 

Segmentation  result 

S(x,y)  =  X  Xi(x,y) 

2=1 

Table  11.2  shows  the  input  and  output  variables  used  in  the  feature  extraction  procedure 
and  the  actual  algorithm  is  shown  in  algorithm  8. 


Table  11.2:  The  input  and  output  variables  used  in  feature  extraction 


•  Input 

-  I(a 

-,y) 

n  -*■  k1 

,  a  greyscale  image. 

•  Output 

—  u  : 

n  - 

,1x4},  the  feature  vectors 

The  feature  extraction  procedure  involves  the  computation  of  the  image  gradient.  The 
gradients  are  then  used  to  compute  the  feature  channels  (u  ).  u  is  finally,  non-linearly  diffused 
to  obtain  the  desired  feature  vector (u). 
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Algorithm  8:  Feature  Extraction 


[Ix,Iy\  —  ComputelmageGradient(I) 
u'=  [Ull2yJXIy]T 
u  =  NonLinearDif fusion(u  ) 


11.4  Results 

In  this  section,  we  apply  the  proposed  method  to  the  zebra  image  to  highlight  the  significant 
improvements  obtained  over  the  method  proposed  in  chapter  8. 

Figure  11.1  shows  the  zebra  image  on  which  the  proposed  method  was  tested  on.  Feature 


Figure  11.1:  The  Zebra  Test  Image 

extraction  was  performed  on  the  above  image  by  the  algorithm  proposed  in  section  11.1.  Figure 
11.2  shows  the  four  feature  channels  obtained  after  feature  extraction  and  non-linear  diffusion. 
These  form  the  feature  vector  u. 


Figure  11.2:  The  4  smoothened  feature  channels (1 , 1%,  Iy,  IxIy) 

Figure  11.3  shows  the  final  stages  of  contour  evolution. 

As  it  can  be  seen,  the  proposed  method  successfully  segments  the  zebra  from  the  background. 
The  results  show  a  significant  improvement  over  the  previous  methods.  Furthermore,  the  new 
algorithm  is  unsupervised. 
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Figure  11.3:  The  Final  Stage  of  Contour  Evolution 


Thus,  the  expanded  feature  space  coupled  with  the  unsupervised  segmentation  algorithm 
proposed  in  chapter  8  overcomes  the  earlier  drawback  that  it  cannot  separate  objects  with 
similar  image  intensities. 

The  current  approach  is,  however,  by  no  means  complete  and  has  been  tested  only  on  gray 
scale  images.  The  proposed  approach  could  be  extended  to  multi-spectral  images  by  future 
research. 
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Chapter  12 

Conclusions 


The  proposed  work  has  two  main  objects:  robust  image  segmentation  and  the  integration  of 
image  segmentation  and  pattern  classification.  Although  image  segmentation  is  a  fundamental 
problem  in  image  analysis,  it  has  been  difficult  for  traditional  image  segmentation  methods  to 
produce  satisfying  results  on  images  with  non-uniform  sub-regions.  We  have  developed  robust 
image  segmentation  methods,  which  can  produce  good  segmentation  results  on  difficult  image 
segmentation  problems  that  traditional  segmentation  methods  cannot  produce  reasonable  re¬ 
sults.  In  traditional  image  analysis,  image  segmentation  performs  a  role  of  preprocessing  for 
pattern  classification  providing  the  input  data.  We  have  integrated  image  segmentation  and 
statistical  pattern  classification  by  using  the  statistical  distribution  of  image  intensity  as  the 
feature  used  in  image  segmentation.  Thus,  the  proposed  image  segmentation  methods  produce 
the  result  of  statistical  pattern  classification  as  well  as  image  segmentation  reducing  the  com¬ 
putation  time  and  effort. 

The  proposed  image  segmentation  methods  use  the  framework  of  active  contours.  As  active 
contours  always  provide  continuous  boundaries  of  sub-regions,  they  can  produce  more  reason¬ 
able  segmentation  results  than  traditional  segmentation  methods,  and  consequently  improve 
the  final  results  of  image  analysis.  The  mathematical  implementation  of  the  proposed  active 
contour  models  is  accomplished  using  level  set  method.  By  presenting  contours  as  a  level  of  a 
topological  function,  we  can  merge  multiple  contours  into  one  contour,  or  can  split  a  contour 
into  multiple  contours,  providing  a  good  flexibility  in  the  use  of  active  contours. 

Traditional  image  segmentation  methods  partition  an  image  based  on  either  discontinuity 
between  sub-regions  or  uniformity  within  a  sub-region  of  a  desired  feature,  e.g.  image  intensity, 
texture.  Instead  of  a  simple  feature,  the  proposed  image  segmentation  methods  use  the  sta¬ 
tistical  distribution  of  image  intensity  within  a  sub-region,  in  a  form  of  conditional  probability 
density  functions,  as  the  feature  to  determine  the  segmentation  criteria.  Since  the  segmentation 
criteria  are  determined  by  the  statistics  of  large  samples  rather  than  a  single  image  intensity 
at  a  pixel,  the  proposed  image  segmentation  methods  show  robust  performance  on  images  with 
noise.  The  proposed  image  segmentation  methods  partition  an  image  in  two  ways:  unsuper¬ 
vised  or  supervised.  The  unsupervised  method  partitions  an  image  based  on  the  uniformity 
of  the  feature  within  a  sub-region.  The  supervised  method  partitions  an  image  based  on  the 
similarity  of  the  feature  within  a  sub-region  to  the  feature  provided  from  corresponding  training 
samples.  The  unique  character  of  the  proposed  methods,  which  helps  them  to  outperform  tradi¬ 
tional  methods,  is  the  use  of  a  mixture  density  function  to  represent  the  statistical  distribution 
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of  image  intensity  within  a  subset.  As  the  segmentation  criteria  is  determined  by  a  mixture 
of  sub-probability  density  functions  instead  of  a  probability  density  function,  the  proposed 
methods  are  able  to  recognize  subsets  with  non-uniform  (multimodal)  image  intensity,  such  as 
zebras.  For  multispectral  images,  we  use  a  mixture  of  multivariate  density  functions  instead 
of  a  product  of  marginal  mixture  density  functions  to  represent  the  statistics  of  complicated 
patterns.  These  methods  recognize  extremely  non-uniform  vector- valued  image  intensities,  such 
as  military  camouflage  patterns,  very  well. 

As  the  application  of  the  proposed  segmentation  principle,  three  image  segmentation  meth¬ 
ods  are  proposed.  First,  we  represent  the  statistics  of  image  intensities  within  each  subset  during 
the  contour  evolution  in  a  form  of  multi-dimensional  histograms.  The  use  of  a  non-parametric 
discrete  function  provides  an  improved  segmentation  results  compared  to  traditional  segmenta¬ 
tion  methods  based  on  particular  stochastic  model  because  the  statistics  of  image  intensity  can 
be  hardly  represented  as  a  simple  stochastic  model.  Second,  we  represent  the  same  statistics  as 
a  mixture  of  multivariate  Gaussian  density  functions  during  the  contour  evolution,  using  a  prior 
knowledge  provided  from  training  samples.  The  use  of  supervised  mixture  density  functions 
provides  significant  improvements  on  segmentation  and  classification  results  because  proposed 
method  can  recognize  extremely  non-uniform  subsets.  This  method  provides  reasonable  results 
even  on  the  images  that  the  previous  method  cannot  segment  well.  Third,  instead  of  computing 
the  statistics  of  image  intensity  during  contour  evolution,  we  compute  the  statistics  of  corre¬ 
sponding  training  samples  in  advance  before  image  segmentation,  and  assign  the  pre-computed 
statistics  as  the  conditional  PDF  of  image  intensities  for  each  subset  during  image  segmenta¬ 
tion.  This  method  provides  reasonable  results  even  on  images  that  the  two  previous  proposed 
methods  cannot  segment  well.  The  Unsupervised  method  was  further  improved  by  expanding 
the  feature  space  to  include  the  image  gradients.  Non-linear  diffusion  was  used  to  smoothen 
the  feature  vector  and  then  used  to  evolve  the  contour.  This  method  produces  even  better 
results  than  the  previous  three  methods.  The  biggest  advantage  of  this  method  is  that  it  is 
unsupervised.  However,  the  method  has  been  tested  only  on  grey  scale  images  given  the  time. 
Future  work  would  be  extending  this  method  to  multi-spectral  images. 

In  order  to  show  the  improved  performance  of  the  proposed  methods,  we  have  applied  them  to 
images  that  other  segmentation  methods  cannot  properly  segment.  As  benchmarks,  the  seg¬ 
mentation  results  of  proposed  methods  and  the  results  of  two  more  traditional  active  contour 
models,  which  share  similar  segmentation  criteria,  are  compared.  The  proposed  methods  have 
shown  very  promising  results  outperforming  the  reference  methods. 

Although  the  proposed  methods  produce  very  robust  and  promising  results,  there  are  still  a 
few  aspects  could  be  improved.  Since  the  proposed  methods  represent  the  statistical  distribution 
of  image  intensity  in  a  complicated  form,  i.e.  a  mixture  of  multivariate  density  functions,  the 
required  computation  is  relatively  larger  than  traditional  methods  increasing  the  convergence 
time.  The  fast  advance  of  computer  processor  may  solve  this  problem.  Since  supervised  methods 
extract  information  from  training  samples,  the  performance  may  rely  on  the  quality  of  training 
samples.  A  smart  choice  of  training  samples  may  reduce  the  effect  of  this  problem. 

The  expanded  feature  space  provides  a  significant  improvement  in  performance.  However, 
this  method  was  not  explored  in  detail  given  the  time  and  can  be  the  basis  for  future  work. 
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